Updated 2018-08-24 02:43:42 by AMG

AMG: [argparse] is a feature-heavy argument parser.

Documentation and test suite to come. For now, look at the big comment at the start of the implementation.

Examples

basic example
proc ex1 {args} {
   puts [argparse -inline {
       {-debug -default 0 -value 1}
       {-increment= -default 1}
       {-from= -default 1}
       {-to= -default 10}
       }]
}
% ex1 -debug
debug 1 increment 1 from 1 to 10
% ex1 -from 1.0 -to 100.0 -increment 0.1
from 1.0 to 100.0 increment 0.1 debug 0

another basic example
proc cmdline {args} {
   puts [argparse -inline -long -equalarg {
       {-debug -value true}
       {-h|help}
       {-type= -required}
       {-i|in=}
       {outfile -required}
       }]
}
% cmdline --type mp3 xyzzy
type mp3 outfile xyzzy
% cmdline --type flac --in waltz.flac 
missing required parameter: outfile
% cmdline --type flac --in waltz.flac waltz.ogg
type flac in waltz.flac outfile waltz.ogg
% cmdline -type flac --in waltz.flac waltz.ogg
type flac in waltz.flac outfile waltz.ogg
% cmdline --type flac --in=waltz.flac waltz.ogg
type flac in waltz.flac outfile waltz.ogg

lsort
proc lsort_ {args} {
    puts [argparse -inline {
        {-ascii      -key sort -value text -default text}
        {-dictionary -key sort -value dictionary}
        {-integer    -key sort -value integer}
        {-real       -key sort -value real}
        {-command=   -forbid {ascii dictionary integer real}}
        {-increasing -key order -value increasing -default increasing}
        {-decreasing -key order -value decreasing}
        -indices
        -index=
        -stride=
        -nocase
        -unique
        list
    }]
}
% lsort_ -increasing -real {2.0 1.0}
order increasing sort real list {2.0 1.0}

lsearch
proc lsearch_ {args} {
    puts [argparse -inline {
        {-exact      -key match -value exact}
        {-glob       -key match -value glob -default glob}
        {-regexp     -key match -value regexp}
        {-sorted     -forbid {glob regexp}}
        -all
        -inline
        -not
        -start=
        {-ascii      -key format -value text -default text}
        {-dictionary -key format -value dictionary}
        {-integer    -key format -value integer}
        {-nocase     -key format -value nocase}
        {-real       -key format -value real}
        {-increasing -key order -value increasing -require sorted -default increasing}
        {-decreasing -key order -value decreasing -require sorted}
        {-bisect     -imply -sorted -forbid {all not}}
        -index=
        {-subindices -require index}
        list
        pattern
    }]
}
% lsearch_ -inline -start 1 -exact -bisect {a b c} b
inline {} start 1 match exact bisect {} sorted {} list {a b c} pattern b

example with a catchall and optional arguments
proc dummy {args} {
    puts [argparse -inline {a? b c* d e?}]
}
# note the interaction between the optional arguments and the catchall argument.
# the optional arguments are assigned first before the catchall argument.
% dummy 1 2
b 1 d 2 c {}
% dummy 1 2 3
a 1 b 2 d 3 c {}
% dummy 1 2 3 4
a 1 b 2 d 3 e 4 c {}
% dummy 1 2 3 4 5
a 1 b 2 c 3 d 4 e 5
% dummy 1 2 3 4 5 6
a 1 b 2 c {3 4} d 5 e 6

One feature not shown by the examples is setting or linking variables. I'm just using the -inline mode for display purposes.

Ideas

Automatic command usage error message

At least when not given an explicit argument list and instead relying on the caller's args variable, I'd like for this command to generate the usage error message for the procedure that's calling it. To make this work, it will need to check the name and argument list of its calling procedure. This won't give the right result in every case though; for example, the caller might be applying logic of its own before calling [argparse]. Therefore, additional switches may be needed to enable, disable, or fine-tune the help text.

Basically, [argparse] would take the place of Tcl_WrongNumArgs() which otherwise would never be called when the proc's parameter list is "args".
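
As a rough illustration of the mechanism, a procedure can discover its caller's name with [info level -1] and build a Tcl_WrongNumArgs()-style message from it. The helper name usageMessage and the synopsis argument are hypothetical and are not part of [argparse]; a real implementation would presumably derive the synopsis from the element definition list rather than take it as a string.

```tcl
# Hypothetical helper, not part of [argparse]: format a Tcl-style usage
# message for whichever procedure called it.
proc usageMessage {synopsis} {
    # [info level -1] is the caller's complete invocation; word 0 is its name.
    set name [lindex [info level -1] 0]
    return "wrong # args: should be \"$name $synopsis\""
}

# A procedure whose parameter list is just "args", so Tcl itself can never
# raise a "wrong # args" error on its behalf.
proc copyFile {args} {
    if {[llength $args] != 2} {
        return -code error [usageMessage "source destination"]
    }
    return "copy [lindex $args 0] -> [lindex $args 1]"
}

catch {copyFile onlyOne} message
puts $message
```

[info args] could be consulted in the same way to inspect the caller's formal parameter list, which is how the "relying on the caller's args variable" case might be detected.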

Help text generation

A -help or -usage switch could be added to the element definition syntax to supply usage information for each switch or parameter. Using this text, nicely-formatted help text can be produced, suitable for display by a command-line program. There's more to it than this though. Some overall help text would need to be supplied to describe the behavior of the program at a higher level, as opposed to simply describing each argument. Plus there's the question of knowing how and when to display this text.

Test suite

Yup, definitely need to write one. There are a lot of complicated features and corner cases to exercise. I have a partial test suite for [argparse], though it only covers parameters and not switches. I'll have to update it for new syntax, expanded capability, and the features I didn't test in the first place.

C API

I plan to rewrite this code in C and provide a stubs-enabled C API, then create a script binding for same. I doubt bytecode optimization will provide any benefit if it's already all rolled up in a single function. Though, more likely it will be several functions: one to parse the element definition list, another to parse an argument list using a pre-parsed element definition list.

One thing that will be needed alongside the script binding is a new internal representation type to cache the parsed element definition list.

Borrow features from [parse_args]

  • -boolean would work the same as -value 1 -default 0.
  • Rename -key to -name, though this makes documentation a bit harder.
  • Automatically set the default -value to match the element name when multiple elements share a -key.
  • Add -validate and -enum.

There are two possibilities for the syntax of -validate.

One: Command prefix to which the argument being validated is appended. The resulting command is then executed. If it evaluates to anything resembling false, or if it has an exceptional (non-OK, non-return) result, then validation fails. Otherwise validation succeeds.

Two: Expression with a prearranged variable, e.g. $arg, mapping to the argument being validated.

Let's try a few possible validations and see how they look using each syntax.

Condition      | Command prefix                       | Expression
Boolean        | string is boolean -strict            | [string is boolean -strict $arg]
Non-empty list | llength                              | [llength $arg]
Enumeration    | apply {arg {expr {$arg in {a b c}}}} | $arg in {a b c}
Positive       | tcl::mathop::< 0                     | $arg > 0

Both syntaxes work, and it's possible to harness the power of either no matter which syntax is chosen. I think the choice comes down to which facilitates switching more easily. The above shows it's easier to evaluate a command prefix within an expression context than it is to evaluate an expression within a command prefix context. Thus, at this point I'm leaning heavily toward using expressions, which is in contrast to the [parse_args] approach.
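
To make the comparison concrete, here is a minimal sketch of how each style might be evaluated. The procedure names validatePrefix and validateExpr are hypothetical, and "anything resembling false" is interpreted here as any value accepted by [string is false -strict]; neither is confirmed [argparse] behavior.

```tcl
# Style one: append the argument to a command prefix and run it.
proc validatePrefix {prefix arg} {
    # An error (or other exceptional result) counts as a failed validation.
    if {[catch {{*}$prefix $arg} result]} {
        return 0
    }
    # Assumption: a result that is a recognized false boolean also fails.
    return [expr {![string is false -strict $result]}]
}

# Style two: evaluate an expression in which $arg names the argument.
proc validateExpr {expression arg} {
    # The expression sees $arg because it is a local variable of this proc.
    if {[catch {expr $expression} result]} {
        return 0
    }
    return [expr {![string is false -strict $result]}]
}

puts [validatePrefix {string is boolean -strict} yes]
puts [validatePrefix llength {}]
puts [validateExpr {$arg in {a b c}} b]
puts [validateExpr {$arg in {a b c}} z]
```

Note that the enumeration check is a one-liner as an expression, whereas the command prefix form needs the [apply] wrapper shown in the table.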

Thanks to >'s unwanted string comparison mode, the above "positive" validation doesn't actually check if the argument is a number; if it's not a number, it checks if it sorts after "0". To complete the check, the following expression is needed: [string is double -strict $arg] && $arg > 0. Written as a command prefix: apply {arg {expr {[string is double -strict $arg] && $arg > 0}}}. Cumbersome.
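
The pitfall is easy to reproduce in plain Tcl. The procedure name isPositive is only for illustration.

```tcl
# A non-numeric operand makes > fall back to string comparison,
# and "abc" sorts after "0".
puts [expr {"abc" > 0}]

# The complete check: the argument must be a number and be greater than zero.
proc isPositive {arg} {
    expr {[string is double -strict $arg] && $arg > 0}
}

puts [isPositive abc]
puts [isPositive 2.5]
puts [isPositive -3]
```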

Support switch clustering

Elsewhere on this page, bll mentioned switch clustering, e.g. "ls -lA". I can add support for this, though it will preclude using a single - for switch names longer than a single character. Instead, only the -long form (--) would be allowed for longer names.
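
A small sketch shows why clustering and single-dash long names cannot coexist. The procedure expandClusters is hypothetical preprocessing, not something [argparse] currently does: it treats every character after a single - as its own switch and leaves -- arguments alone.

```tcl
# Hypothetical preprocessing step: split clustered single-dash switches.
proc expandClusters {arguments} {
    set result {}
    foreach arg $arguments {
        if {[regexp {^-[^-]} $arg]} {
            # Single dash: each following character becomes its own switch.
            foreach char [split [string range $arg 1 end] ""] {
                lappend result -$char
            }
        } else {
            # "--name" long switches and plain parameters pass through.
            lappend result $arg
        }
    }
    return $result
}

puts [expandClusters {-lA --all dir}]
puts [expandClusters {-debug}]
```

The second call breaks -debug into five one-character switches, which is why longer names would have to be written in the -- form once clustering is enabled.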

Bugs

In updating an existing application to use this code, I found and fixed a few bugs and feature gaps, but it's quite possible more issues remain. Please report anything you find right here.

References

Command Option Parsing
    Has more references and links to discussion
extending the notation of proc args
    Backward-compatible proposal to incorporate similar functionality into the proc parameter list
parse_args
    Very similar functionality to [argparse], implemented in C

Discussion

Arguments in many places

bll 2018-8-23: Arguments can appear in many places: The command line, in an environment variable, global configuration files, user configuration files. (1) The argument processing should have the flexibility of doing:
# overly simplistic pseudo-code
argparse $dataFromGlobalConfig
argparse $dataFromLocalConfig
argparse $::env(MYPROG_ENV)
argparse $::argv

and end up with a single set of options.

AMG: These all would work if you simply inserted the element definition as the initial argument. For command line parsing, consider using -long, -equalarg, and -mixed to more closely resemble the switch syntax supported by most common Unix-like commands.

Counting duplicate arguments

bll: (2) I would always like to see some sort of ability to supply duplicate arguments that increment a value. e.g. -v -v -v is often seen on a command line to increase a verbosity level.

AMG: At present, the only support for duplicate arguments is via -pass; the caller can then do secondary parsing of the pass-through variable or dict value.
argparse {{-v -pass v}}
set v [llength $v]

I've considered adding special support for this usage, but the above isn't so bad.

Option aliases

bll: I'm just looking at some option code I wrote (not for Tcl) and what else would be nice. (3) Option aliases, I think you can support already with: -D -imply debug or -D -require debug.

AMG: There's -alias or its | shorthand. [argparse {-D|debug}] will recognize -D, -d, -de, -deb, -debu, -debug. ([argparse {-D|debug -define}] will not recognize -d or -de as ways of writing -debug.) Try this in combination with -pass and -normalize, by the way.

bll: I hate prefixes and would always use -exact. Ok, figured out the syntax. I find it rather confusing. -D= -alias debug specifies that debug is an alias of it, not that -D is an alias of -debug. This is backwards in my mind. I think I would use: -D= -hasalias debug or -debug -alias D. Well, this one will drive me crazy if I need it.
% set args [list -D=5]
-D=5
% argparse -equalarg -exact -template apopts(%) {{-debug=} {-D= -alias debug}}
% parray apopts
apopts(D) = 5

AMG: I'm not sure I understand the complaint here. You say you would prefer "-debug -alias D" but that is exactly what is expected and supported. Saying "-D -alias debug" is indeed backwards. As you point out, the key will default to D rather than debug. The way it works is the first element is the switch or parameter name, then subsequent elements modify the definition.

I recommend the shorthand: "-D|debug=" means the same as "-debug= -alias D". The longest possible form is "debug -switch -alias D -argument". But use the shorthand. Shorthand is good. The long form exists solely to regularize the internals. The previous version of this code only supported shorthand, and it meant the internals were searching strings for single-character flags, plus had no room for expansion. But obviously the long form is too verbose for typical use. Thus I allow both syntaxes.

Why do I list the alias first in the shorthand? Because the alias is almost always a single character, and this results in a neater display. Personal preference. Sorry if this contributes to confusion.
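
For readers who want to see the correspondence spelled out, the following sketch translates the shorthand into the long form, following the shorthand rules given in the comment at the start of the implementation. The procedure expandShorthand is hypothetical and is not how [argparse] does it internally; it also skips checks such as "=" being valid only for switches.

```tcl
# Hypothetical translator from element shorthand to the long form.
proc expandShorthand {element} {
    # Optional "-", optional "alias|", the name, then any flag characters.
    set pattern {^(-)?(?:(\w[\w-]*)\|)?(\w[\w-]*)([=?!*^]*)$}
    if {![regexp $pattern $element -> dash alias name flags]} {
        error "bad element shorthand: $element"
    }
    set long [list $name]
    # A leading "-" marks a switch; anything else is a parameter.
    lappend long [expr {$dash eq "-" ? "-switch" : "-parameter"}]
    if {$alias ne ""} {
        lappend long -alias $alias
    }
    # Map each flag character to its long-form modifier.
    set flagMap {= -argument ? -optional ! -required * -catchall ^ -upvar}
    foreach flag [split $flags ""] {
        lappend long [dict get $flagMap $flag]
    }
    return $long
}

puts [expandShorthand -D|debug=]
puts [expandShorthand argv*]
```

The first call yields the long form quoted above, debug -switch -alias D -argument.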

bll: Our minds work differently. I read -alias as is-an-alias-of.

AMG: Here, have a real-life example, modified a bit from the production form:
argparse -mixed {
    {-s|slocs           -key fileMode -value slocs}
    {-t|total           -key fileMode -value total}
    {-d|density         -key fileMode -value density}
    {-l|language        -key fileMode -value language}
    {-f|filename        -key fileMode -value filename -default filename}
    {-r|reverse         -key fileReverse}
    {-o|omit            -key fileMode -value omit}
    {-S|summarySlocs    -key summaryMode -value slocs}
    {-T|summaryTotal    -key summaryMode -value total}
    {-D|summaryDensity  -key summaryMode -value density}
    {-L|summaryLanguage -key summaryMode -value language -default language}
    {-C|summaryCount    -key summaryMode -value count}
    {-R|summaryReverse  -key summaryReverse}
    {-O|summaryOmit     -key summaryMode -value omit}
    -n|noRecurse
    -x|exclude=
    -g|debug
    argv*
} $argv

Hopefully this should clarify how aliases are intended to be written. Basically, don't use -alias, rather use the shorthand.

Clustering

bll: (4) Possibly a legacy mode, where -ab = -a -b instead of -a b. I don't know if this is necessary.

AMG: I believe you're talking about switch clustering, which is a nice feature for compactness and command line parsing, but it collides very badly with long options. The existence of switch clustering is the whole reason we have --switch long option syntax. I could create a -cluster switch which would imply -long and make single - support only single-character unique prefixes (presumably from aliases) but allow multiple per argument.

I don't know what you mean by -a b. In context I would guess "the switch whose name is the two-character sequence ab" but it's not clear.

bll: If there is no clustering, -ab is the same as -a=b or -a b (old unix style). I believe the programming community is moving away from switch clustering. I would definitely lump it into the legacy category. I cannot recall if the community is moving away from the -ab no-space syntax (I think so).

AMG: Ah, I completely forgot about having the argument immediately follow the single-character switch name. I'm not keen on supporting that, but it does turn up in a lot of places. Example: -I/usr/local/include.

Ignoring options

bll: (5) Be able to specify an option that is just ignored.

AMG: Use -ignore: [argparse {{-foo -ignore} -bar}] will parse both -foo and -bar but will only set the variable bar.

Switch and parameter arrays

bll 2018-8-23: In extending the notation of proc args an example is given where the options are stored in local variables: e.g. ${-start}. I do not like this at all. I would very much prefer to specify and access an array: $apopts(-start). Then I can check the options from other procedures in my program.

AMG: There are numerous ways to fine-tune where the switch and parameter values are stored.
argparse -template apopts(%) {-start= -exact}
    Switches go in apopts array, having keys "start" and "exact"
argparse -template apopts(-%) {-start= -exact}
    Switches go in apopts array, having keys "-start" and "-exact"
argparse {{-start= -key apopts(-start)} {-exact= -key apopts(-exact)}}
    Switches go in apopts array, having keys "-start" and "-exact"
set apopts [argparse -inline {-start= -exact}]
    Switches go in apopts dict, having keys "start" and "exact"

While arrays and dicts work, they do have the drawback of not being compatible with -upvar because Tcl does not allow for an array element or dict value to be a link to another variable.
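
The array limitation comes from [upvar] itself and can be seen without [argparse] at all. In this sketch the procedure name linkDemo and the variable names are arbitrary.

```tcl
proc linkDemo {} {
    set target 1
    # Linking an ordinary scalar name to another variable works.
    upvar 0 target alias
    # A link name that looks like an array element is rejected by [upvar].
    set failed [catch {upvar 0 target opts(start)} message]
    return [list $alias $failed]
}

puts [linkDemo]
```

The second element of the result is 1, meaning the attempt to create opts(start) as a link raised an error.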

Saying -template apopts(-%) is cheating because it prefixes all array element names with -, both switches and parameters. But if you're only using switches, and you really want that - in there, it's fine. There is not the ability to set a different template for switches than for parameters, but you are able to individually set the -key of each element.

bll: With my simple option parser I use, I often specify some parameters as -parameter <value>, so that's not a big issue. I sort of like the - in front, but that's not an issue.

Auto defaulting of simple switches

bll: RFE: I would like an option to auto-default simple switches. If the switch is not specified, a false value would always be returned.
# hypothetical example
% set args [list]
% unset -nocomplain apopts
% argparse -template apopts(%) -simple false {-test1 -stride=}
% parray apopts
apopts(test1) = false

If you add -boolean, as listed above among the features to borrow from [parse_args], this would not be needed.

AMG: Yeah, -boolean is probably what you want since you're only addressing half the issue. The "default -default" is for the array element (or whatever) to not even be created in the first place, but the "default -value" (i.e. the value used for a switch lacking an argument) is empty string. It would be very strange to have the choices be false and empty string.

I think I would have -boolean be permitted both as a switch modifying individual switches and as a switch applying to the entire [argparse] command. In the latter case it would change the default -default and -value to 0 and 1 for switches that lack -argument (or -optional, -required, and -catchall, all of which imply -argument when used with -switch; also note that these all have shorthand syntaxes you're more likely to use).

Easy testing if a parameter was specified

bll: Another RFE: I also have in mine a very simple way of testing whether a switch was specified at all, so instead of many [info exists opts(-weburl)], I simply do if { $opts(-weburl) } ... and access the argument with $opts(-weburl.arg).

This could be reversed:
# hypothetical example and I may have wrong syntax 
set args {}
puts [argparse -inline -specifyexists -stride=]
stride.exists false
set args -stride=5
puts [argparse -inline -specifyexists -stride=]
stride.exists true stride 5

AMG: So you're suggesting having separate keys to track existence and value. I don't have a problem with [info exists] and frequently use the presence or absence of a variable to signal a boolean, particularly if there's additional detail attached to a "true" condition, most especially if there is no value I can reserve to indicate "false". But if you prefer another style then that can be accommodated.

This is interesting in combination with "-optional" switches (i.e. tri-state switches that can either be omitted altogether, be supplied as the final argument and have no argument of their own, or be present and given an argument). Currently, such switches are either omitted from the result, have empty string as their value, or have a single-element dict mapping from empty string to the real value. With -inline this becomes a bit more natural:
Script                              | Return value
argparse -inline {-foo?} {}         | {}
argparse -inline {-foo?} {-foo}     | foo {}
argparse -inline {-foo?} {-foo bar} | foo {{} bar}

This allows:
set opt [argparse -inline {-foo?}]
dict exists $opt foo       ;# Checks if -foo was specified
dict exists $opt foo {}    ;# Checks if -foo was specified and given a value
dict get $opt foo {}       ;# Returns the value of -foo or throws an error if not given one
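
The three states can be tried out with plain dicts. The literal values below are copied from the return values listed above rather than produced by calling [argparse], and the procedure name describeFoo is only for illustration.

```tcl
# Classify a result dict according to the three states of an optional switch.
proc describeFoo {opt} {
    if {![dict exists $opt foo]} {
        return "omitted"
    } elseif {![dict exists $opt foo {}]} {
        return "present without a value"
    } else {
        return "present with value [dict get $opt foo {}]"
    }
}

puts [describeFoo {}]
puts [describeFoo {foo {}}]
puts [describeFoo {foo {{} bar}}]
```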

With your proposal, the above would instead be:
Script                                             | Return value
argparse -inline -specifyexists {-foo?} {}         | foo 0
argparse -inline -specifyexists {-foo?} {-foo}     | foo 1
argparse -inline -specifyexists {-foo?} {-foo bar} | foo 1 foo.arg bar

set opt [argparse -inline -specifyexists {-foo?}]
dict get $opt foo          ;# Checks if -foo was specified
dict exists $opt foo.arg   ;# Checks if -foo was specified and given a value
dict get $opt foo.arg      ;# Returns the value of -foo or throws an error if not given one

bll: I have to think about this. Whereas the utility is nice, I don't want to create more code mess to support something that might only be used by one person. Let me roll this around in the back of my head for a few days.

Mixing - and --

bll: Now this surprised me:
cmdline -type flac --in waltz.flac waltz.ogg

I have never seen -option and --option mixed on the command line before. Slightly unusual. If you do end up supporting options with no space, e.g. -I/usr/include/local, this will probably go away.

AMG: Having both in the same command line is admittedly weird, but rejecting it would be arbitrary and useless. Possibly the command line was built up in parts coming from different places. Without -long, both would have to be -type and -in. With -long, the above is valid, though the expectation is that the user would pick one style and stick with it. With -cluster (doesn't exist yet), the caller would have to use --type and --in, or single-character aliases/prefixes, e.g. -t and -i.

Related: I'm considering adding an -appendarg switch to go along with -equalarg to allow the argument to be directly appended to single-character switch names. This would make -I/usr/local/include possible.

-name

bll: I like -key. You could make -name an alias of -key.

-validate

bll: Keep it simple. And then also allow validation by a user defined procedure. Then every time someone wants yet another validation type, you can point out that they can write their own.

I think you could just have the argument to -validate be one of the [string is] types. That covers the basic validations.

AMG: I plan to adopt [parse_args]'s approach.

Right now I only have the ability to validate the presence or absence of switches, parameters, and basic combinations thereof, using -required, -optional, -require, -reciprocal, -forbid, and -imply. The arguments can be anything.

Of course there will always be complex cases, for example a certain switch can only be used when another switch has a certain value. I'm not going to directly support such things because the caller can always check for anything it wants after [argparse] returns. I just wanted to deal with the most obvious situations, and I feel -validate and -enum qualify.

Though, one extra thing I could do to -enum over and above basic validation, mostly to tease you but also because it's consistent with Tcl's general love of prefix matching, would be to accept any unambiguous prefixes of any of the listed enumerations and to set the value to the full match string. For example, -enum {1 0 yes no true false on off} would replace y with yes. This approximates Tcl_GetBoolean() [1], aside from only accepting lowercase. But this specific example would be better implemented as -validate {string is boolean} which accepts all the same inputs plus uppercase versions and leaves the value untouched because it'll presumably get interpreted later by code that understands booleans.
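
Tcl 8.6 already ships a building block for this kind of matching, [tcl::prefix match]. The sketch below only shows that command's behavior on the example list; whether -enum would be built on it is an assumption.

```tcl
set choices {1 0 yes no true false on off}

# An unambiguous prefix is replaced by the full choice.
puts [tcl::prefix match $choices y]
puts [tcl::prefix match $choices fa]

# "o" could mean either "on" or "off", so the match raises an error.
puts [catch {tcl::prefix match $choices o}]

# A value that matches nothing also raises an error.
puts [catch {tcl::prefix match $choices maybe}]
```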

Biased testimonial

AMG: Take it for what it's worth, since I'm just talking about my own code, but lately I found that having this functionality (a more limited predecessor version) has transformed the way I program in Tcl. I wish I had written this code years ago, but I've only had it for a month or two. I'm now free to make more flexible procedures that take many arguments, no longer having to worry about the complexity of argument parsing or the nightmare of long argument lists. I don't have to make contorted syntaxes that always put the "args" parameter at the end if an alternative would be more natural (e.g. a list of switches up at the front) since [argparse] lets optional, defaulted, and catchall arguments appear anywhere. In addition to defaulted arguments, I have optional arguments for which I don't need to worry about picking some default value that will never show up in normal usage; just check [info exists] to see if they were passed or not. It's now much easier to link to caller variables: just tack ^ on the end of the switch or parameter name. I'm now even using [argparse] for variadic data structures. [dict] would have sufficed for that purpose, except [dict] is a more rigid format.

Because of how much I'm using it these days for professional work, I'm definitely interested in making this code faster.

bll: My apologies for not reading everything thoroughly, though the clarifications help. I am looking forward to seeing it as a package and in C form.

I definitely agree with you. I made a very simple option parser for my application, and it has helped a lot with making the code clearer and like you I wish I had written it a little sooner.

Unfortunately, everybody has their own way of parsing their command lines, and unless every possible ability is supported, it is hard for an option/argument parser to gain traction.

AMG: Agreed. I feel it's a losing proposition no matter what. If the parser is too simplistic, it won't be useful enough to even bother publishing. If it supports too much stuff, it's too daunting and people either don't read the whole documentation or give up without trying. And if it's somewhere in the middle, it's still not flexible enough for many cases since there are so many possible syntaxes in the wild. Thus, I'm pretty well forced to implement the kitchen sink and accept the consequences of bloat. Because it's not merely my code that's bloated; it's the problem space that's bloated, and I'm just dealing with it.

One thing I think will be helpful is a gentle introduction, one I have utterly failed to provide. Accompanying my code I wrote a very long yet dense comment giving a complete reference but not real examples. Then I dumped this code on the wiki and didn't spend much additional time writing tutorial material. All I did was write a few complex examples showing [lsort] and [lsearch]. And now I'm afraid to add more material to this page because it's so long already.

If I port this to C, it will be accompanied by both an introduction and a reference.

Compatibility

This code is written for Tcl 8.6 and newer.

If you want to use this with Tcl 8.5, you will need:

For Tcl 8.4, you will need:

Code

# argparse --
# Parses an argument list according to a definition list.  The result may be
# stored into caller variables or returned as a dict.
#
# The [argparse] command accepts the following switches:
#
# -inline        Return the result dict rather than setting caller variables
# -exact         Require exact switch name matches, and do not accept prefixes
# -mixed         Allow switches to appear after parameters
# -long          Recognize "--switch" long option alternative syntax
# -equalarg      Recognize "-switch=arg" inline argument alternative syntax
# -normalize     Normalize switch syntax in pass-through result keys
# -reciprocal    Every element's -require constraints are reciprocal
# -level LEVEL   Every -upvar element's [upvar] level; defaults to 1
# -template TMP  Transform default element names using a substitution template
# -pass KEY      Pass unrecognized elements through to a result key
# --             Force next argument to be interpreted as the definition list
#
# After the above switches comes the definition list argument, then finally the
# optional argument list argument.  If the argument list is omitted, it is taken
# from the caller's args variable.
#
# Each element of the definition list is itself a list containing a unique,
# non-empty name element consisting of alphanumerics, underscores, and minus
# (not as the first character), then zero or more of the following switches:
#
# -switch        Element is a switch; conflicts with -parameter
# -parameter     Element is a parameter; conflicts with -switch
# -alias ALIAS   Alias name; requires -switch
# -ignore        Element is omitted from result; conflicts with -key and -pass
# -key KEY       Override key name; not affected by -template
# -pass KEY      Pass through to result key; not affected by -template
# -default VAL   Value if omitted; conflicts with -required
# -value VAL     Value if present; requires -switch; conflicts with -argument
# -argument      Value is next argument following switch; requires -switch
# -optional      Switch value is optional, or parameter is optional
# -required      Switch is required, or stop -catchall from implying -optional
# -catchall      Value is list of all otherwise unassigned arguments
# -upvar         Links caller variable; conflicts with -inline and -catchall
# -level LEVEL   This element's [upvar] level; requires -upvar
# -standalone    If element is present, ignore -required, -require, and -forbid
# -require LIST  If element is present, other elements that must be present
# -forbid LIST   If element is present, other elements that must not be present
# -imply LIST    If element is present, extra switch arguments; requires -switch
# -reciprocal    This element's -require is reciprocal; requires -require
#
# If neither -switch nor -parameter are used, a shorthand form is permitted.  If
# the name is preceded by "-", it is a switch; otherwise, it is a parameter.  An
# alias may be written after "-", then followed by "|" and the switch name.  The
# element name may be followed by any number of flag characters:
#
# "="            Same as -argument; only valid for switches
# "?"            Same as -optional
# "!"            Same as -required
# "*"            Same as -catchall
# "^"            Same as -upvar
#
# -default specifies the value to assign to element keys when the element is
# omitted.  If -default is not used, keys for omitted switches and parameters
# are omitted from the result, unless -catchall is used, in which case the
# default value for -default is empty string.
#
# At most one parameter may use -catchall.
#
# If multiple elements share the same -key value, they automatically are given
# -forbid constraints to prevent them from being used simultaneously.  Elements
# sharing the same -key value cannot use -catchall, nor can they use -default
# multiple times.
#
# -value specifies the value to assign to switch keys when the switch is
# present.  -value may not be used with -argument, -optional, -required, or
# -catchall.  -value defaults to empty string.  -value is especially useful in
# combination with multiple elements sharing the same -key.
#
# -optional, -required, and -catchall imply -argument when used with -switch.
#
# If -argument is used, the value assigned to the switch's key is normally the
# next argument following the switch.  With -catchall, the value assigned to the
# switch's key is instead the list of all remaining arguments.  With -optional,
# the following processing is applied:
#
# - If the switch is not present, the switch's key is omitted from the result.
# - If the switch is not the final argument, its value is a two-element list
#   containing empty string followed by the argument after the switch.
# - If the switch is the final argument, its value is empty string.
#
# By default, switches are optional and parameters are required.  Switches can
# be made required with -required, and parameters can be made optional with
# -optional.  -catchall also makes parameters optional, unless -required is
# used, in which case at least one argument must be assigned to the parameter.
# Otherwise, using -required with -parameter has no effect.  -switch -optional
# -required means the switch must be present but may be the final argument.
#
# When -switch and -optional are used, -catchall, -default, and -upvar are
# disallowed.  -parameter -optional -required is also a disallowed combination.
#
# Unambiguous prefixes of switch names are acceptable, unless the -exact switch
# is used.  Switches in the argument list normally begin with a single "-" but
# can also begin with "--" if the -long switch is used.  Arguments to switches
# normally appear as the list element following the switch, but if -equalarg is
# used, they may be supplied within the switch element itself, delimited with an
# "=" character, e.g. "-switch=arg".
#
# The per-element -pass switch causes the element argument or arguments to be
# appended to the value of the indicated pass-through result key.  Many elements
# may use the same pass-through key.  If -normalize is used, switch arguments
# are normalized to not use aliases, abbreviations, the "--" prefix, or the "="
# argument delimiter; otherwise, switches will be expressed the same way they
# appear in the original input.  If -mixed is used, pass-through keys will list
# all switches first before listing any parameters.  If no arguments are
# assigned to a pass-through key, its value will be empty string.
#
# The [argparse] -pass switch may be used to collect unrecognized arguments into
# a pass-through key, rather than failing with an error.  Normalization and
# unmixing will not be applied to these arguments because it is not possible to
# reliably determine if they are switches or parameters; in particular, it is
# not known if an undefined switch expects an argument.
#
# [argparse] produces a set of keys and values.  The keys are the names of
# caller variables into which the values are stored, unless -inline is used, in
# which case the key-value pairs are returned as a dict.  The key names default
# to the element names, unless overridden by -key, -pass, or -template.  If
# both -key and -pass are used, two keys are defined: one having the element
# value, the other having the pass-through elements.
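#
# To illustrate the two-key case (the names "count" and "opts" are hypothetical,
# and the result was traced by hand):
#
#   argparse -inline {{-n= -key count -pass opts}} {-n 3}
#       ;# result dict: count 3 opts {-n 3}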
#
# -template applies to elements using neither -key nor -pass.  Its value is a
# substitution template applied to element names to determine key names.  "%" in
# the template is replaced with the element name.  To protect "%" or "\" from
# replacement, precede it with "\".  One use for -template is to put the result
# in an array, e.g. with "-template arrayName(%)".
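#
# A hand-traced sketch using a hypothetical array name "opt":
#
#   argparse -inline -template opt(%) {-a b} {-a 1}
#       ;# result dict: opt(a) {} opt(b) 1
#
# Without -inline, the same call would be expected to set the caller's array
# elements opt(a) and opt(b) instead of returning a dict.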
#
# Elements with -upvar are special.  Rather than having normal values, they are
# bound to caller variables using the [upvar] command.  -upvar conflicts with
# -inline because it is not possible to map a dict value to a variable.  Due to
# limitations of arrays and [upvar], -upvar cannot be used with keys whose names
# resemble array elements.  -upvar conflicts with -catchall because the value
# must be a variable name, not a list.  The combination -switch -optional -upvar
# is disallowed for the same reason.  If -upvar is used with switches or with
# optional parameters, [info exists KEY] returns 0 both when the element is not
# present and when its value is the name of a nonexistent variable.  To tell the
# difference, check if [info vars KEY] returns an empty list; if so, the element
# is not present.  Note that the argument to [info vars] is a [string match]
# pattern, so it may be necessary to precede *?[]\ characters with backslashes.
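#
# A minimal sketch of an -upvar parameter, using the "^" shorthand flag (the
# procedure name "bump" and the variable names are hypothetical):
#
#   proc bump {args} {
#       argparse {var^}     ;# links local "var" to the caller's named variable
#       incr var
#   }
#   set n 5
#   bump n                  ;# n is now 6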
#
# Argument processing is performed in three stages: switch processing, parameter
# allocation, and parameter assignment.  Each argument processing stage and pass
# is performed left-to-right.
#
# All switches must normally appear in the argument list before any parameters.
# Switch processing terminates with the first argument (besides arguments to
# switches) that does not start with "-" (or "--", if -long is used).  The
# special switch "--" can be used to force switch termination if the first
# parameter happens to start with "-".  If no switches are defined, the first
# argument is known to be a parameter even if it starts with "-".
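#
# For example, in this hand-traced sketch with hypothetical elements "-v" and
# "a", the "--" lets a parameter value that looks like a switch through:
#
#   argparse -inline {-v a} {-- -v}
#       ;# result dict: a -v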
#
# When the -mixed switch is used, switch processing continues after encountering
# arguments that do not start with "-" or "--".  This is convenient but may be
# ambiguous in cases where parameters look like switches.  To resolve ambiguity,
# the special "--" switch terminates switch processing and forces all remaining
# arguments to be parameters.
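#
# A hand-traced sketch of -mixed, again with hypothetical element names:
#
#   argparse -inline -mixed {-v a b} {x -v y}
#       ;# result dict: v {} a x b y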
#
# After switch processing, parameter allocation determines how many arguments to
# assign to each parameter.  Arguments assigned to switches are not used in
# parameter processing.  First, arguments are allocated to required parameters;
# second, to optional, non-catchall parameters; and last to catchall parameters.
# Finally, each parameter is assigned the allocated number of arguments.
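#
# The allocation order can be seen in this hand-traced sketch, which assumes
# four hypothetical parameters: required "a" and "d", optional "b", and
# catchall "c":
#
#   argparse -inline {a b? c* d} {1 2}
#       ;# result dict: a 1 d 2 c {}
#   argparse -inline {a b? c* d} {1 2 3 4 5}
#       ;# result dict: a 1 b 2 c {3 4} d 5
#
# With only two arguments, both go to the required parameters, "b" is left
# unset, and the catchall "c" is given an empty list.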
proc argparse {args} {
    # Process switches and locate the definition argument.
    set level 1
    for {set i 0} {$i < [llength $args]} {incr i} {
        if {[lindex $args $i] eq "--"} {
            # Stop after "--".
            incr i
            break
        } elseif {[catch {
            regsub {^-} [tcl::prefix match -message switch {
                -equalarg -exact -inline -level -long -mixed -normalize -pass
                -reciprocal -template
            } [lindex $args $i]] {} switch
        }]} {
            # Stop at the first non-switch argument.
            break
        } elseif {$switch ni {level pass template}} {
            # Process switches with no arguments.
            set $switch {}
        } elseif {$i == [llength $args] - 1} {
            return -code error "-$switch requires an argument"
        } else {
            # Process switches with arguments.
            set $switch [lindex $args [incr i]]
        }
    }

    # Extract the definition and args parameters from the argument list, pulling
    # from the caller's args variable if the args parameter is omitted.
    switch [expr {[llength $args] - $i}] {
    0 {
        return -code error "missing required parameter: definition"
    } 1 {
        set definition [lindex $args end]
        set argv [uplevel 1 {::set args}]
    } 2 {
        set definition [lindex $args end-1]
        set argv [lindex $args end]
    } default {
        return -code error "too many arguments"
    }}

    # Parse element definition list.
    set def {}
    set aliases {}
    set order {}
    set switches {}
    set upvars {}
    foreach elem $definition {
        # Read element definition switches.
        set opt {}
        for {set i 1} {$i < [llength $elem]} {incr i} {
            if {[set switch [regsub {^-} [tcl::prefix match {
                -alias -argument -catchall -default -forbid -ignore -imply -key
                -level -optional -parameter -pass -reciprocal -require -required
                -standalone -switch -upvar -value
            } [lindex $elem $i]] {}]] ni {
                alias default forbid imply key level pass require value
            }} {
                # Process switches without arguments.
                dict set opt $switch {}
            } elseif {$i == [llength $elem] - 1} {
                return -code error "-$switch requires an argument"
            } else {
                # Process switches with arguments.
                incr i
                dict set opt $switch [lindex $elem $i]
            }
        }

        # Process the first element of the element definition.
        if {![llength $elem]} {
            return -code error "element definition cannot be empty"
        } elseif {[dict exists $opt switch] && [dict exists $opt parameter]} {
            return -code error "-switch and -parameter conflict"
        } elseif {![dict exists $opt switch] && ![dict exists $opt parameter]} {
            # If -switch and -parameter are not used, parse shorthand syntax.
            if {![regexp -expanded {
                ^(?:(-)             # Leading switch "-"
                (?:(\w[\w-]*)\|)?)? # Optional switch alias
                (\w[\w-]*)          # Switch or parameter name
                ([=?!*^]*)$         # Optional flags
            } [lindex $elem 0] _ minus alias name flags]} {
                return -code error "bad element shorthand: [lindex $elem 0]"
            }
            if {$minus ne {}} {
                dict set opt switch {}
            } else {
                dict set opt parameter {}
            }
            if {$alias ne {}} {
                dict set opt alias $alias
            }
            foreach flag [split $flags {}] {
                dict set opt [dict get {
                    = argument ? optional ! required * catchall ^ upvar
                } $flag] {}
            }
        } elseif {![regexp {^\w[\w-]*$} [lindex $elem 0]]} {
            return -code error "bad element name: [lindex $elem 0]"
        } else {
            # If exactly one of -switch or -parameter is used, the first element
            # of the definition is the element name, with no processing applied.
            set name [lindex $elem 0]
        }

        # Check for collisions.
        if {[dict exists $def $name]} {
            return -code error "element name collision: $name"
        }

        if {[dict exists $opt switch]} {
            # For switches, -optional, -required, and -catchall imply -argument.
            foreach switch {optional required catchall} {
                if {[dict exists $opt $switch]} {
                    dict set opt argument {}
                }
            }
        } else {
            # Parameters are required unless -catchall or -optional are used.
            if {([dict exists $opt catchall] || [dict exists $opt optional])
             && ![dict exists $opt required]} {
                dict set opt optional {}
            } else {
                dict set opt required {}
            }
        }

        # Check requirements and conflicts.
        foreach {switch other} {reciprocal require   level upvar} {
            if {[dict exists $opt $switch] && ![dict exists $opt $other]} {
                return -code error "-$switch requires -$other"
            }
        }
        foreach {switch others} {
            parameter {alias value argument imply}
            ignore    {key pass}
            required  default
            argument  value
            upvar     {inline catchall}
        } {
            if {[dict exists $opt $switch]} {
                foreach other $others {
                    if {[dict exists $opt $other]} {
                        return -code error "-$switch and -$other conflict"
                    }
                }
            }
        }
        if {[dict exists $opt upvar] && [info exists inline]} {
            return -code error "-upvar and -inline conflict"
        }

        # Check for disallowed combinations.
        foreach combination {
            {switch optional catchall}
            {switch optional upvar}
            {switch optional default}
            {parameter optional required}
        } {
            foreach switch [list {*}$combination {}] {
                if {$switch eq {}} {
                    return -code error "[join [lmap switch $combination {
                        string cat - $switch
                    }]] is a disallowed combination"
                } elseif {![dict exists $opt $switch]} {
                    break
                }
            }
        }

        # Compute default output key if -ignore, -key, and -pass aren't used.
        if {![dict exists $opt ignore] && ![dict exists $opt key]
         && ![dict exists $opt pass]} {
            if {[info exists template]} {
                dict set opt key [string map\
                        [list \\\\ \\ \\% % % $name] $template]
            } else {
                dict set opt key $name
            }
        }

        if {[dict exists $opt parameter]} {
            # Keep track of parameter order.
            lappend order $name

            # Forbid more than one catchall parameter.
            if {[dict exists $opt catchall]} {
                if {[info exists catchall]} {
                    return -code error "multiple catchall parameters:\
                            $catchall and $name"
                } else {
                    set catchall $name
                }
            }
        } elseif {![dict exists $opt alias]} {
            # Build list of switches.
            lappend switches -$name
        } elseif {![regexp {^\w[\w-]*$} [dict get $opt alias]]} {
            return -code error "bad alias: [dict get $opt alias]"
        } elseif {[dict exists $aliases [dict get $opt alias]]} {
            return -code error "element alias collision: [dict get $opt alias]"
        } else {
            # Build list of switches (with aliases), and link switch aliases.
            dict set aliases [dict get $opt alias] $name
            lappend switches -[dict get $opt alias]|$name
        }

        # Map from upvar keys back to element names, and forbid collisions.
        if {[dict exists $opt upvar] && [dict exists $opt key]} {
            if {[dict exists $upvars [dict get $opt key]]} {
                return -code error "multiple upvars to the same variable:\
                        [dict get $upvars [dict get $opt key]] $name"
            }
            dict set upvars [dict get $opt key] $name
        }

        # Save element definition.
        dict set def $name $opt
    }

    # Process constraints.
    dict for {name opt} $def {
        # Verify constraint references.
        foreach constraint {require forbid} {
            if {[dict exists $opt $constraint]} {
                foreach otherName [dict get $opt $constraint] {
                    if {![dict exists $def $otherName]} {
                        return -code error "$name -$constraint references\
                                undefined element: $otherName"
                    }
                }
            }
        }

        # Create reciprocal requirements.
        if {([info exists reciprocal] || [dict exists $opt reciprocal])
          && [dict exists $opt require]} {
            foreach other [dict get $opt require] {
                dict update def $other otherOpt {
                    dict lappend otherOpt require $name
                }
            }
        }

        # Disallow use of -catchall and multiple -default in combination with
        # shared keys, and create forbid constraints on shared keys.
        if {[dict exists $opt key]} {
            dict for {otherName otherOpt} $def {
                if {$name ne $otherName && [dict exists $otherOpt key]
                 && [dict get $otherOpt key] eq [dict get $opt key]} {
                    if {[dict exists $opt catchall]} {
                        return -code error "$name cannot use -catchall because\
                                it shares a key with $otherName"
                    } elseif {[dict exists $opt default]
                           && [dict exists $otherOpt default]} {
                        return -code error "$name and $otherName cannot both\
                                use -default because they share a key"
                    } elseif {![dict exists $otherOpt forbid]
                           || $name ni [dict get $otherOpt forbid]} {
                        dict update def $otherName otherOpt {
                            dict lappend otherOpt forbid $name
                        }
                    }
                }
            }
        }
    }

    # Handle default pass-through switch by creating a dummy element.
    if {[info exists pass]} {
        dict set def {} pass $pass
    }

    # Perform switch logic before doing anything with parameters.
    set result {}
    set missing {}
    if {[llength $switches]} {
        # Build regular expression to match switches.
        set re ^-
        if {[info exists long]} {
            append re -?
        }
        append re {(\w[\w-]*)}
        if {[info exists equalarg]} {
            append re (?:(=)(.*))?
        } else {
            append re ()()
        }
        append re $

        # Process switches, and build the list of parameter arguments.
        set params {}
        while {[llength $argv]} {
            # Check if this argument appears to be a switch.
            set argv [lassign $argv arg]
            if {[regexp $re $arg _ name equal val]} {
                # This appears to be a switch.  Fall through to the handler.
            } elseif {$arg eq "--"} {
                # If this is the special "--" switch to end all switches, all
                # remaining arguments are parameters.
                set params $argv
                break
            } elseif {[info exists mixed]} {
                # If -mixed is used and this is not a switch, it is a parameter.
                # Add it to the parameter list, then go to the next argument.
                lappend params $arg
                continue
            } else {
                # If this is not a switch, terminate switch processing, and
                # process this and all remaining arguments as parameters.
                set params [linsert $argv 0 $arg]
                break
            }

            # Process switch aliases.
            if {[dict exists $aliases $name]} {
                set name [dict get $aliases $name]
            }

            # Preliminary guess for the normalized switch name.
            set normal -$name

            # Perform switch name lookup.
            if {[dict exists $def $name switch]} {
                # Exact match.  No additional lookup needed.
            } elseif {![info exists exact] && ![catch {
                tcl::prefix match -message switch [lmap {key data} $def {
                    if {[dict exists $data switch]} {
                        set key
                    } else {
                        continue
                    }
                }] $name
            } name]} {
                # Use the switch whose prefix unambiguously matches.
                set normal -$name
            } elseif {[dict exists $def {}]} {
                # Use default pass-through if defined.
                set name {}
            } else {
                # Fail if this is an invalid switch.
                set switches [lsort $switches]
                if {[llength $switches] > 1} {
                    lset switches end "or [lindex $switches end]"
                }
                set switches [join $switches\
                        {*}[if {[llength $switches] > 2} {list ", "}]]
                return -code error "bad switch \"$arg\": must be $switches"
            }

            # If the switch is standalone, ignore all constraints.
            if {[dict exists $def $name standalone]} {
                foreach other [dict keys $def] {
                    dict unset def $other required
                    dict unset def $other require
                    dict unset def $other forbid
                    if {[dict exists $def $other parameter]} {
                        dict set def $other optional {}
                    }
                }
            }

            # Keep track of which elements are present.
            dict set def $name present {}

            # If the switch value was set using -switch=value notation, insert
            # the value into the argument list to be handled below.
            if {$equal eq "="} {
                set argv [linsert $argv 0 $val]
            }

            # Load key and pass into local variables for easy access.
            unset -nocomplain key pass
            foreach var {key pass} {
                if {[dict exists $def $name $var]} {
                    set $var [dict get $def $name $var]
                }
            }

            # Store the switch value into the result dict.
            if {[dict exists $def $name catchall]} {
                # The switch is catchall, so store all remaining arguments.
                if {[info exists key]} {
                    dict set result $key $argv
                }
                if {[info exists pass]} {
                    if {[info exists normalize]} {
                        dict lappend result $pass $normal {*}$argv
                    } else {
                        dict lappend result $pass $arg {*}$argv
                    }
                }
                break
            } elseif {![dict exists $def $name argument]} {
                # The switch expects no arguments.
                if {$equal eq "="} {
                    return -code error "$normal doesn't allow an argument"
                }
                if {[info exists key]} {
                    if {[dict exists $def $name value]} {
                        dict set result $key [dict get $def $name value]
                    } else {
                        dict set result $key {}
                    }
                }
                if {[info exists pass]} {
                    if {[info exists normalize]} {
                        dict lappend result $pass $normal
                    } else {
                        dict lappend result $pass $arg
                    }
                }
            } elseif {[llength $argv]} {
                # The switch was given the expected argument.
                if {[info exists key]} {
                    if {[dict exists $def $name optional]} {
                        dict set result $key [list {} [lindex $argv 0]]
                    } else {
                        dict set result $key [lindex $argv 0]
                    }
                }
                if {[info exists pass]} {
                    if {[info exists normalize]} {
                        dict lappend result $pass $normal [lindex $argv 0]
                    } elseif {$equal eq "="} {
                        dict lappend result $pass $arg
                    } else {
                        dict lappend result $pass $arg [lindex $argv 0]
                    }
                }
                set argv [lrange $argv 1 end]
            } else {
                # The switch was not given the expected argument.
                if {![dict exists $def $name optional]} {
                    return -code error "$normal requires an argument"
                }
                if {[info exists key]} {
                    dict set result $key {}
                }
                if {[info exists pass]} {
                    if {[info exists normalize]} {
                        dict lappend result $pass $normal
                    } else {
                        dict lappend result $pass $arg
                    }
                }
            }

            # Prepend argument list with this switch's implied arguments.
            if {[dict exists $def $name imply]} {
                set argv [concat [dict get $def $name imply] $argv]
                dict unset def $name imply
            }
        }

        # Build list of missing required switches.
        dict for {name opt} $def {
            if {[dict exists $opt switch] && ![dict exists $opt present]
             && [dict exists $opt required]} {
                if {[dict exists $opt alias]} {
                    lappend missing -[dict get $opt alias]|$name
                } else {
                    lappend missing -$name
                }
            }
        }

        # Fail if at least one required switch is missing.
        if {[llength $missing]} {
            set missing [lsort $missing]
            if {[llength $missing] > 1} {
                lset missing end "and [lindex $missing end]"
            }
            set missing [join $missing\
                    {*}[if {[llength $missing] > 2} {list ", "}]]
            return -code error [string cat "missing required switch"\
                    {*}[if {[llength $missing] > 1} {list es}] ": " $missing]
        }
    } else {
        # If no switches are defined, bypass the switch logic and process all
        # arguments using the parameter logic.
        set params $argv
    }

    # Allocate one argument to each required parameter, including catchalls.
    set alloc {}
    set count [llength $params]
    foreach name $order {
        if {[dict exists $def $name required]} {
            if {$count} {
                dict set alloc $name 1
                incr count -1
            } else {
                lappend missing $name
            }
        }
    }

    # Fail if at least one required parameter is missing.
    if {[llength $missing]} {
        if {[llength $missing] > 1} {
            lset missing end "and [lindex $missing end]"
        }
        return -code error [string cat "missing required parameter"\
                {*}[if {[llength $missing] > 1} {list s}] ": "\
                [join $missing {*}[if {[llength $missing] > 2} {list ", "}]]]
    }

    # Try to allocate one argument to each optional, non-catchall parameter,
    # until there are no arguments left.
    if {$count} {
        foreach name $order {
            if {![dict exists $def $name required]
             && ![dict exists $def $name catchall]} {
                dict set alloc $name 1
                if {![incr count -1]} {
                    break
                }
            }
        }
    }

    # Process excess arguments.
    if {$count} {
        if {[info exists catchall]} {
            # Allocate remaining arguments to the catchall parameter.
            dict incr alloc $catchall $count
        } elseif {[dict exists $def {}]} {
            # If there is no catchall parameter, instead allocate to the default
            # pass-through result key.
            lappend order {}
            dict set alloc {} $count
        } else {
            return -code error "too many arguments"
        }
    }

    # Check constraints.
    dict for {name opt} $def {
        if {[dict exists $opt present]} {
            foreach {match condition description} {
                1 require requires 0 forbid "conflicts with"
            } {
                if {[dict exists $opt $condition]} {
                    foreach otherName [dict get $opt $condition] {
                        if {[dict exists $def $otherName present] != $match} {
                            foreach var {name otherName} {
                                if {[dict exists $def [set $var] switch]} {
                                    set $var -[set $var]
                                }
                            }
                            return -code error "$name $description $otherName"
                        }
                    }
                }
            }
        }
    }

    # Store parameters in result dict.
    set i 0
    foreach name $order {
        if {[dict exists $alloc $name]} {
            if {![dict exists $def $name catchall] && $name ne {}} {
                set val [lindex $params $i]
                if {[dict exists $def $name pass]} {
                    dict lappend result [dict get $def $name pass] $val
                }
                incr i
            } else {
                set step [dict get $alloc $name]
                set val [lrange $params $i [expr {$i + $step - 1}]]
                if {[dict exists $def $name pass]} {
                    dict lappend result [dict get $def $name pass] {*}$val
                }
                incr i $step
            }
            if {[dict exists $def $name key]} {
                dict set result [dict get $def $name key] $val
            }
        }
    }

    # Create default values for missing elements.
    dict for {name opt} $def {
        if {[dict exists $opt key]
         && ![dict exists $result [dict get $opt key]]} {
            if {[dict exists $opt default]} {
                dict set result [dict get $opt key] [dict get $opt default]
            } elseif {[dict exists $opt catchall]} {
                dict set result [dict get $opt key] {}
            }
        }
        if {[dict exists $opt pass]
         && ![dict exists $result [dict get $opt pass]]} {
            dict set result [dict get $opt pass] {}
        }
    }

    # Return result dict or store into caller variables.
    if {[info exists inline]} {
        return $result
    } else {
        dict for {key val} $result {
            if {![dict exists $upvars $key]} {
                uplevel 1 [list ::set $key $val]
            } else {
                if {[dict exists $def [dict get $upvars $key] level]} {
                    set thisLevel [dict get $def [dict get $upvars $key] level]
                } else {
                    set thisLevel $level
                }
                uplevel 1 [list ::upvar $thisLevel $val $key]
            }
        }
    }
}