Writing Optimal Code with TJC


Writing optimal Tcl code is not a black art. A developer need only understand a few Tcl evaluation basics in order to avoid usage problems that lead to non-optimal code. The TJC compiler is able to generate optimized code for most common Tcl command usage. This page describes some of the usage problems that would keep the TJC compiler from generating optimized code.

Braced Expressions

By far, the most important thing a developer can do is check that each math expression in a Tcl script is brace quoted. The expr command accepts math expression arguments. A math expression can also be found in an if, for, or while command. An unbraced expression cannot be compiled and will execute very slowly since the entire expression will be reparsed each time the command is executed. A brace quoted expression argument can be compiled and optimized. The following example shows an unbraced and a braced expression.

  1. expr $a < 3
  2. expr {$a < 3}

The unbraced expr (1) passes three arguments to the expr command at runtime. In this example, assume the variable a is set to 1. The unbraced expr command would be called with the argument strings "1", "<", and "3". These arguments would then be concatenated into the string "1 < 3" and parsed into an operator tree structure like the following:

    <
   / \
  1   3

The operator tree is then evaluated at runtime and the logical value 1 is returned since the left operand is smaller than the right operand.

The braced expr (2) has a single brace quoted argument. The braced expr can be compiled and inlined by TJC, and will not invoke the expr command at runtime. This braced expr is parsed into an operator tree structure like the following:

    <
   / \
  $a   3

The important difference to note here is that the variable a is not replaced with the string "1" in this case. This is important because the compiler is able to tell that there is only one operator and that the left operand is a value contained in the variable a. The compiler is able to inline the variable access, the operator logic, and the constant integer operand 3. The braced expr executes more quickly because the slow process of parsing the expression into a tree is avoided at runtime.

The unbraced expr (1) can't be parsed into a tree at compile time because it is impossible to know what the variable might evaluate to. For example, the variable a could be set to a string that evaluates to an additional operator.

% set a "2 + 2"
2 + 2
% expr $a < 3
0

When the variable holds a plain number, both expr command usages described above will produce the same result, but the compiled version will execute much more quickly. The rule of thumb to remember is that every expression should be enclosed in brace quote characters {}.
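
The difference can be observed with the standard time command. The following is a minimal sketch, assuming it is run in an ordinary Tcl shell. The iteration count of 10000 is arbitrary, and the reported timings depend on the interpreter and machine, so no particular numbers should be expected.

```tcl
set a 1

# Unbraced: the expression string is rebuilt and reparsed on every call.
set unbraced [time {expr $a < 3} 10000]

# Braced: the expression is a constant string that can be parsed ahead of time.
set braced [time {expr {$a < 3}} 10000]

puts "unbraced: $unbraced"
puts "braced:   $braced"
```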

The same logic applies to expression arguments to the if, for, or while commands.

  1. if "$a < 3" {puts "a is less than 3"}
  2. if {$a < 3} {puts "a is less than 3"}

The unbraced if (1) command would not be compiled while the braced if (2) command would be compiled. The unbraced if (1) command would execute significantly more slowly when compared to the braced if (2) command.

The only exception to this rule is an expression that is a constant boolean value (like 0, 1, true, or false). Such an expression need not be braced in order to be compiled. This is supported so that either of the following usages will eliminate dead code at compile time.

if 0 {
    never_call_1
}
if {0} {
    never_call_2
}

Braced Script Arguments

In general, commands that accept scripts as arguments (catch, expr, for, foreach, if, switch, and while) can only be compiled when each of the script arguments is brace quoted. For example, the following for loop cannot be compiled.

set script "incr i"
for {set i 0} {$i < 10} $script {
    puts "i is $i"
}

This type of usage is valid Tcl code, but it is not common usage and cannot be compiled.
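
When the loop step is known ahead of time, the same loop can be written with every script argument brace quoted. A minimal sketch of that rewrite, assuming the step is always incr i:

```tcl
# Every script argument is a brace quoted constant, so the whole
# loop is visible at compile time.
for {set i 0} {$i < 10} {incr i} {
    puts "i is $i"
}
```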

The foreach Command

Use of the foreach command is significantly optimized by the TJC compiler. The most common usage of foreach is to loop over a single list using a single variable:

set l {1 2 3 4}
foreach v $l {
    puts "v is $v"
}

The foreach command can also be used to loop over multiple elements in a list. The following example shows how one might implement looping over multiple elements with a for command and how the same thing can be accomplished with a foreach command.

set l {1 2 3 4}

set len [llength $l]
for {set i 0} {$i < $len} {incr i 2} {
    set v1 [lindex $l $i]
    set v2 [lindex $l [expr {$i + 1}]]
    puts "v1 is $v1"
    puts "v2 is $v2"
}

foreach {v1 v2} $l {
    puts "v1 is $v1"
    puts "v2 is $v2"
}

The foreach command can also loop over multiple lists with a single loop. The following example shows how one might implement looping over multiple lists with a for command and how the same thing can be accomplished with a foreach command.

set l1 {1 2 3 4}
set l2 {-1 -2 -3 -4}

set len [llength $l1]
for {set i 0} {$i < $len} {incr i} {
    set v1 [lindex $l1 $i]
    set v2 [lindex $l2 $i]
    puts "v1 is $v1"
    puts "v2 is $v2"
}

foreach v1 $l1 v2 $l2 {
    puts "v1 is $v1"
    puts "v2 is $v2"
}

Using a foreach command in situations like those shown above is always going to execute more quickly than a for command because the TJC compiler contains specific optimizations to cover these usages.

Be aware that in order to compile a foreach command, the variable name argument(s) must be either a constant string or a constant list. The following usages are NOT recommended. The variable name(s) are not known at compile time, so the loop body would not get compiled and the resulting code would execute very slowly.

foreach $varname {1 2 3 4 5} {
    # Very slow!
}
foreach arr($key) {1 2 3 4 5} {
    # Very slow!
}
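
One possible way around this is to loop with a constant variable name and copy the value into the dynamically named variable inside the body. The sketch below assumes a hypothetical variable varname that holds the target name. This page does not say how well the inner set command is optimized, so treat this as an illustration of keeping the foreach itself compilable rather than a measured improvement.

```tcl
# Hypothetical setup: the target variable name is only known at runtime.
set varname result

foreach v {1 2 3 4 5} {
    # v is a constant variable name, so the foreach can be compiled.
    set $varname $v
}
puts "$varname is [set $varname]"
```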

The incr Command

Use of the incr command is significantly optimized by the TJC compiler. The incr command supports both array and scalar variables. Be aware that incr executes significantly faster when used with scalar variables. For example:

for {set i 0} {$i < 1000} {incr i} {
    # Do something
}

The loop above would not execute as quickly if an array variable were used as the loop variable:

for {set arr(i) 0} {$arr(i) < 1000} {incr arr(i)} {
    # Do something
}
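
If the counter has to end up in an array element, one option is to copy it into a scalar for the duration of the loop and store it back afterwards. A minimal sketch, assuming nothing else reads arr(i) while the loop runs:

```tcl
# Hypothetical array element used as a counter.
set arr(i) 0

# Copy to a scalar, loop on the scalar, then store the final value back.
set i $arr(i)
for {} {$i < 1000} {incr i} {
    # Do something
}
set arr(i) $i
```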

Constant Values In Loops

It is generally a good idea to initialize a local variable to the value of an expression that will remain constant during a loop instead of evaluating the expression each time the loop is executed. In the following example, the llength command is called each time the loop is executed.

set l {1 2 3 4 5 6}
for {set i 2} {$i < [llength $l]} {incr i} {
    puts "value is [lindex $l $i]"
}

The value returned by llength will always be 6, so it is more efficient to invoke this command just once before the loop begins.

set l {1 2 3 4 5 6}
set len [llength $l]
for {set i 2} {$i < $len} {incr i} {
    puts "value is [lindex $l $i]"
}
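
When the loop body only needs the elements and not the index, the same output can be produced with a foreach over a sublist. This is a sketch of an alternative, not something the page measures. The lrange command is evaluated once, before the first iteration.

```tcl
set l {1 2 3 4 5 6}

# lrange runs a single time and yields the elements from index 2 onward.
foreach v [lrange $l 2 end] {
    puts "value is $v"
}
```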

The switch Command

The switch command supports two usages that are legal in Tcl code.

switch $string {
    "Foo" {puts "matched Foo"}
    "Bar" {puts "matched Bar"}
}

switch $str \
    "Foo" {puts "matched Foo"} \
    "Bar" {puts "matched Bar"}

The first usage supports a static list of pattern/script elements and is the most common way switch is used. In the example above, the second usage is identical. The second usage exists because the developer might want to match against patterns that are not constant strings. For example:

set pat1 "Foo"
set pat2 "Bar"
switch $str \
    $pat1 {puts "matched Foo"} \
    $pat2 {puts "matched Bar"}

Both of these switch command usages are compiled by TJC. The only thing to be aware of is that in either case the matched string cannot start with the '-' character. This is because the switch command also accepts option arguments like -glob, -regexp, -exact, and --. Typically, the user would not want an error to be generated if the string being matched just happened to start with a '-' character. This error condition can be avoided by adding -- just after the switch command.

switch -- $string {
    "Foo" {puts "matched Foo"}
    "Bar" {puts "matched Bar"}
}

Adding -- as the second argument is the safest way to use the switch command. TJC will also generate slightly better code when the -- argument is used with a switch command.
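
As an illustration, the following sketch matches a string that happens to begin with the '-' character. The input value and the result variable are hypothetical names chosen for the example.

```tcl
# Hypothetical input that begins with '-'.
set string "-verbose"

# The -- marks the end of options, so $string is always treated as
# the string to match and never as an option.
switch -- $string {
    "-verbose" {set result "matched -verbose"}
    "Foo"      {set result "matched Foo"}
    default    {set result "no match"}
}
puts $result
```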

A TJC compiled switch command is optimized for constant string patterns. Using a switch command to compare a string to a number of constant string values is always going to be faster than using an if/elseif command.

switch -- $string {
    "Foo" {puts "matched Foo"}
    "Bar" {puts "matched Bar"}
    "Baz" -
    "Zaz" {puts "matched Baz or Zaz"}
}

if {$string == "Foo"} {
    puts "matched Foo"
} elseif {$string == "Bar"} {
    puts "matched Bar"
} elseif {$string == "Baz" || $string == "Zaz"} {
    puts "matched Baz or Zaz"
}

The lappend Command

The lappend command is significantly optimized by the TJC compiler, but some usages are better than others. For example, the following code is significantly optimized.

catch {unset mylist}
set mylist [list]
lappend mylist A B C

The lappend command supports appending to a variable that is not yet set, but this usage is not optimized. For example, the following lappend usage would not execute as quickly as the example above.

catch {unset mylist}
lappend mylist A B C

The lappend command also supports initializing a variable value to an empty list, but this usage is not optimized. The following code would also execute less quickly than the first example given above.

catch {unset mylist}
lappend mylist
lappend mylist A B C

The lappend command supports passing multiple values to be appended. It is more efficient to pass multiple items to be appended to a single lappend command as opposed to using multiple lappend commands. For example, the following code:

catch {unset mylist}
set mylist [list]
lappend mylist A B \
    REAL_LONG_ELEMENT

Will produce more efficient code than the following:

catch {unset mylist}
set mylist [list]
lappend mylist A
lappend mylist B
lappend mylist REAL_LONG_ELEMENT

The lappend command supports appending to an array variable, as in the following example:

set arr(elems) [list]
lappend arr(elems) A B C

While using lappend with array variables is optimized by TJC, it is much faster to use the lappend command with scalar variables. Using a scalar will not make much difference with just one lappend command, but a number of lappend commands in a loop would execute more quickly if a scalar local variable is used instead of an array variable. For example, the loop:

set arr(elems) [list]
foreach var {1 2 3 4 5 6 7 8 9 10} {
    lappend arr(elems) $var
}

Would execute more quickly when rewritten as:

set l [list]
foreach var {1 2 3 4 5 6 7 8 9 10} {
    lappend l $var
}
set arr(elems) $l

The append Command

The append command is significantly optimized by the TJC compiler, but some usages are better than others. For example, the following code is significantly optimized.

catch {unset myvar}
set myvar ""
append myvar STR1 STR2 STR3

The append command supports appending to a variable that is not yet set, but this usage is not optimized. For example, the following append usage would not execute as quickly as the example above.

catch {unset myvar}
append myvar STR1 STR2 STR3

The append command supports passing multiple values to be appended. It is more efficient to pass multiple items to be appended to a single append command as opposed to using multiple append commands. For example, the following code:

catch {unset myvar}
set myvar ""
append myvar STR1 STR2 \
    REAL_LONG_STRING

Will produce more efficient code than the following:

catch {unset myvar}
set myvar ""
append myvar STR1
append myvar STR2
append myvar REAL_LONG_STRING

The append command supports appending to an array variable, as in the following example:

set arr(str) ""
append arr(str) HELLO
append arr(str) " "
append arr(str) THERE

While supported, append operations on array variables can't be optimized and will execute very slowly. Developers should use append exclusively with scalar variables. The example given above should be rewritten as:

set str ""
append str HELLO
append str " "
append str THERE
set arr(str) $str
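
The same pattern applies when a string is built up in a loop. The sketch below uses a hypothetical helper named build_greeting that appends to a scalar local variable and stores the result in the array only once, at the end.

```tcl
# Hypothetical helper: joins words with spaces and stores the result
# in the str element of the array named by arrname.
proc build_greeting {arrname words} {
    upvar 1 $arrname arr
    set str ""
    foreach w $words {
        # Appending to a scalar local, with several values per call.
        append str $w " "
    }
    set arr(str) [string trimright $str]
}

build_greeting greeting {HELLO THERE}
puts $greeting(str)
```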

The list Command

The list command is optimized for use with many common list operations. The following example shows the optimal way to initialize a variable for use with lappend or any other list operation.

catch {unset mylist}
set mylist [list]
lappend mylist A B C

The example code above is more efficient than the following two examples. Both of the examples below will initialize the variable to an empty string instead of an empty list.

catch {unset mylist}
set mylist ""
lappend mylist A B C

catch {unset mylist}
set mylist {}
lappend mylist A B C

All three of the examples above will produce the exact same results, but the first example that makes explicit use of the list command to initialize the variable will execute more quickly.
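
The claim that the three forms give the same result can be checked with a short script. This sketch uses the hypothetical variable names l1, l2, and l3, and it only compares the resulting values. It says nothing about the relative speed of each form.

```tcl
# Three ways to start from an empty value before calling lappend.
set l1 [list]
set l2 ""
set l3 {}

lappend l1 A B C
lappend l2 A B C
lappend l3 A B C

# Each variable now holds a list with the same three elements.
puts [list $l1 $l2 $l3]
```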