Writing Optimal Code with TJC
Writing optimal Tcl code is not a black art. A developer need only
understand a few Tcl evaluation basics in order to avoid usage problems
that lead to non-optimal code. The TJC compiler is able to generate
optimized code for most common Tcl command usage. This page describes
some of the usage problems that would keep the TJC compiler from
generating optimized code.
Braced Expressions
By far, the most important thing a developer can do is check that each
math expression in a Tcl script is brace quoted. The expr command accepts math
expression arguments. A math expression can also be found in an if, for, or while command. An unbraced
expression cannot be compiled and will execute very slowly
since the entire expression will be reparsed each time the command is
executed. A brace quoted expression argument can be compiled and
optimized. The following example shows an unbraced and a braced
expression.
1. expr $a < 3
2. expr {$a < 3}
The unbraced expr (1)
passes three arguments to the expr command at runtime. In this example,
assume the variable a is
set to 1. The unbraced expr command would be called with the argument
strings "1", "<", and "3". These arguments would then be
concatenated into the string "1 < 3" and parsed into an operator
tree structure like the following:
  <
 / \
1   3
The operator tree is then evaluated at runtime and the logical value 1
is returned since the left operand is smaller than the right operand.
The braced expr (2) has a
single brace quoted argument. The braced expr can be compiled and
inlined by TJC, and will not invoke the expr command at runtime. This
braced expr is parsed into an operator tree structure like the
following:
   <
  / \
$a   3
The important difference to note here is that the variable a is not replaced with the
string "1" in this case. This is important because the compiler is able
to tell that there is only one operator and that the left operand is a
value contained in the variable a.
The compiler is able to inline the variable access, the operator logic,
and the constant integer operand 3. The braced expr executes more
quickly because the slow process of parsing the expression into a tree
is avoided at runtime.
The unbraced expr (1)
can't be parsed into a tree at compile time because it is impossible to
know what the variable might evaluate to. For example, the variable a could be set to a string that
evaluates to an additional operator.
% set a "2 + 2"
2 + 2
% expr $a < 3
0
While both expr command
usages described above will produce the same results, the compiled
version will execute much more quickly. The rule of thumb to remember
is that every expression should be enclosed in brace quote characters {}.
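The cost of reparsing can be observed in a standard Tcl interpreter with the built-in time command. The following is only a rough sketch: the iteration count of 10000 is an arbitrary choice, the numbers depend on the interpreter and machine, and it does not measure TJC compiled code.

```tcl
# Compare an unbraced and a braced expression using the time command.
# The iteration count (10000) is an arbitrary value chosen for this sketch.
set a 1

# Unbraced: arguments are substituted, concatenated, and reparsed on each call.
set unbraced_result [expr $a < 3]
set unbraced_time [time {expr $a < 3} 10000]

# Braced: the expression is passed as a single unsubstituted argument.
set braced_result [expr {$a < 3}]
set braced_time [time {expr {$a < 3}} 10000]

puts "unbraced: result $unbraced_result, $unbraced_time"
puts "braced:   result $braced_result, $braced_time"
```

Both forms return 1 here because a is set to 1. Only the reported time per iteration is expected to differ.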
The same logic applies to expression arguments to the if, for, or while commands.
1. if "$a < 3" {puts "a is less than 3"}
2. if {$a < 3} {puts "a is less than 3"}
The unbraced if (1)
command would not be compiled while the braced if (2) command would be
compiled. The unbraced if (1)
command would execute significantly more slowly when compared to the
braced if (2) command.
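The while command follows the same rule. The short sketch below (variable names are illustrative only) braces both the loop condition and the nested expr. Note that a double quoted condition such as "$i < 5" would be substituted only once, before while runs, so in this loop the condition would never change and the loop would not terminate.

```tcl
# Sum the integers 0 through 4 using braced expressions throughout.
set i 0
set total 0
while {$i < 5} {
    # The braced expr argument is evaluated with the current values each pass.
    set total [expr {$total + $i}]
    incr i
}
puts "total is $total"
```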
The only exception to this rule is an expression that consists solely of
a constant boolean value (like 0, 1, true, false). Such an expression
need not be braced in order to be compiled. This is supported so that
either of the following usages will eliminate dead code at compile time.
if 0 {
never_call_1
}
if {0} {
never_call_2
}
Braced Script Arguments
In general, commands that accept scripts as arguments (catch, expr, for, foreach, if,
switch, and while) can only be compiled when each of the script
arguments is brace quoted. For example, the following for loop cannot
be compiled.
set script "incr i"
for {set i 0} {$i < 10} $script {
puts "i is $i"
}
This type of usage is valid Tcl code, but it is not common usage
and cannot be compiled.
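When the script held in the variable is fixed, as it is in the loop above, the straightforward fix is to write it inline as a brace quoted argument. A minimal sketch of the same loop with every script argument braced:

```tcl
# Every script argument to for is brace quoted, so each one is
# visible when the code is compiled.
for {set i 0} {$i < 10} {incr i} {
    puts "i is $i"
}
```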
The foreach Command
Use of the foreach
command is significantly optimized by the TJC compiler. The most common
usage of foreach is to loop over a single list using a single variable:
set l {1 2 3 4}
foreach v $l {
puts "v is $v"
}
The foreach
command can also be used to loop over multiple elements in a list. The
following example shows how one might implement looping over multiple
elements with a for
command and how the same thing can be accomplished with a foreach command.
set l {1 2 3 4}
set len [llength $l]
for {set i 0} {$i < $len} {incr i 2} {
set v1 [lindex $l $i]
set v2 [lindex $l [expr {$i + 1}]]
puts "v1 is $v1"
puts "v2 is $v2"
}
foreach {v1 v2} $l {
puts "v1 is $v1"
puts "v2 is $v2"
}
The foreach command can
also loop over multiple lists with a single loop. The following example
shows how one might implement looping over multiple lists with a for command and how the
same thing can be accomplished with a foreach command.
set l1 {1 2 3 4}
set l2 {-1 -2 -3 -4}
set len [llength $l1]
for {set i 0} {$i < $len} {incr i} {
set v1 [lindex $l1 $i]
set v2 [lindex $l2 $i]
puts "v1 is $v1"
puts "v2 is $v2"
}
foreach v1 $l1 v2 $l2 {
puts "v1 is $v1"
puts "v2 is $v2"
}
Using a foreach command
in situations like those shown above is always going to execute more
quickly than a for
command because the TJC compiler contains specific optimizations to
cover these usages.
Be aware that in order to compile a foreach command, the variable
name argument(s) must be either a constant string or a constant list.
The following usages are NOT
recommended. The variable name(s) are not known at compile time, so the
loop body would not get compiled and the resulting code would execute
very slowly.
foreach $varname {1 2 3 4 5} {
# Very slow!
}
foreach arr($key) {1 2 3 4 5} {
# Very slow!
}
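One possible way to avoid these forms is to loop with a constant variable name and assign to the runtime-determined name inside the body. This is only a sketch: the names varname, key, v, and elem are illustrative, and this page does not say how well TJC optimizes the inner set commands, so any speedup is an assumption to verify.

```tcl
# varname and key hold names that are only known at runtime (illustrative).
set varname result
set key k
set sum 0

# The loop variable v is a constant name, so it is known at compile time.
foreach v {1 2 3 4 5} {
    set $varname $v
    incr sum $v
}

# The same approach for an array element chosen at runtime.
foreach elem {1 2 3 4 5} {
    set arr($key) $elem
}

puts "$varname is [set $varname], sum is $sum, arr($key) is $arr($key)"
```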
The incr Command
Use of the incr
command is significantly optimized by the TJC compiler. The incr command supports both
array and scalar variables. Be aware that incr executes significantly
faster when used with scalar variables. For example:
for {set i 0} {$i < 1000} {incr i} {
# Do something
}
The loop above would not execute as quickly if an array variable
were used as the loop variable:
for {set arr(i) 0} {$arr(i) < 1000} {incr arr(i)} {
# Do something
}
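When a counter must end up in an array element, one option is to copy it into a scalar, increment the scalar inside the loop, and store the result back afterwards. This is a sketch of that pattern, with illustrative names (arr(count), count, item):

```tcl
# Count list elements using a scalar, then store the total in the array.
set arr(count) 0
set count $arr(count)
foreach item {a b c d} {
    # incr operates on a scalar variable inside the loop.
    incr count
}
set arr(count) $count
puts "count is $arr(count)"
```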
Constant Values In Loops
It is generally a good idea to initialize a local variable to the value
of an expression that will remain constant during a loop instead of
evaluating the expression each time the loop is executed. In the
following example, the llength
command is called each time the loop is executed.
set l {1 2 3 4 5 6}
for {set i 2} {$i < [llength $l]} {incr i} {
puts "value is [lindex $l $i]"
}
The value returned by llength
will always be 6, so it is more efficient to invoke this command just
once before the loop begins.
set l {1 2 3 4 5 6}
set len [llength $l]
for {set i 2} {$i < $len} {incr i} {
puts "value is [lindex $l $i]"
}
The switch Command
The switch command
supports two usages that are legal in Tcl code.
switch $string {
"Foo" {puts "matched Foo"}
"Bar" {puts "matched Bar"}
}
switch $str \
"Foo" {puts "matched Foo"} \
"Bar" {puts "matched Bar"}
The first usage supports a static list of pattern/script
elements and is the most common way switch is used. In the example
above, the second usage behaves identically. The second usage exists because
the developer might want to match against patterns that are not
constant strings. For example:
set pat1 "Foo"
set pat2 "Bar"
switch $str \
$pat1 {puts "matched Foo"} \
$pat2 {puts "matched Bar"}
Both of these switch command usages are compiled by TJC. The only thing
to be aware of is that in either case the matched string cannot start
with the '-' character. This is because the switch command also accepts
option arguments like -glob,
-regexp, -exact, and --. Typically, the user would not want an
error to be generated if the string being matched just happened to
start with a '-' character. This error condition can be avoided by
adding -- just after the switch command.
switch -- $string {
"Foo" {puts "matched Foo"}
"Bar" {puts "matched Bar"}
}
Adding -- as the second argument is the safest way to use the switch
command. TJC will also generate slightly better code when the --
argument is used with a switch command.
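The following sketch shows the situation described above with a string that starts with a '-' character. The value "-verbose" and the variable names are illustrative. The second switch is wrapped in catch because its behavior without -- is the error case being described, and some Tcl versions treat the two-argument form differently.

```tcl
set string "-verbose"
set matched none

# With --, the string is never mistaken for an option.
switch -- $string {
    "-verbose" {set matched verbose}
    "-quiet" {set matched quiet}
}

# Without --, the string may be parsed as an option and raise an error.
set failed [catch {
    switch $string \
        "-verbose" {set matched2 verbose} \
        "-quiet" {set matched2 quiet}
} msg]

puts "with --: matched $matched"
puts "without --: catch returned $failed"
```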
A TJC compiled switch
command is optimized for constant string patterns. Using a switch command to compare a
string to a number of constant string values is always going to be
faster than using an if/elseif
command.
switch -- $string {
"Foo" {puts "matched Foo"}
"Bar" {puts "matched Bar"}
"Baz" -
"Zaz" {puts "matched Baz or Zaz"}
}
if {$string == "Foo"} {
puts "matched Foo"
} elseif {$string == "Bar"} {
puts "matched Bar"
} elseif {$string == "Baz" || $string == "Zaz"} {
puts "matched Baz or Zaz"
}
The lappend Command
The lappend
command is significantly optimized by the TJC compiler, but some usages
are better than others. For example, the following code is significantly
optimized.
catch {unset mylist}
set mylist [list]
lappend mylist A B C
The lappend
command supports appending to a variable that is not yet set,
but this usage is not optimized. For example, the following lappend
usage would not execute as quickly as the example above.
catch {unset mylist}
lappend mylist A B C
The lappend
command also supports initializing a variable value to an
empty list, but this usage is not optimized. The following code would
also execute less quickly than the first example given above.
catch {unset mylist}
lappend mylist
lappend mylist A B C
The lappend
command supports passing multiple values to be appended. It is more
efficient to pass multiple items to be appended to a single lappend command as opposed to
using multiple lappend
commands. For example, the following code:
catch {unset mylist}
set mylist [list]
lappend mylist A B \
REAL_LONG_ELEMENT
Will produce more
efficient code than the following:
catch {unset mylist}
set mylist [list]
lappend mylist A
lappend mylist B
lappend mylist REAL_LONG_ELEMENT
The lappend
command supports appending to an array variable, as in the following
example:
set arr(elems) [list]
lappend arr(elems) A B C
While using lappend
with array variables is optimized by TJC, it is much faster to use the lappend command with scalar
variables. Using a scalar will not make much difference with just one lappend command, but a number
of lappend commands in a
loop would execute more quickly if a scalar local variable is used
instead of an array variable. For example, the loop:
set arr(elems) [list]
foreach var {1 2 3 4 5 6 7 8 9 10} {
lappend arr(elems) $var
}
Would execute more quickly when rewritten as:
set l [list]
foreach var {1 2 3 4 5 6 7 8 9 10} {
lappend l $var
}
set arr(elems) $l
The append Command
The append
command is significantly optimized by the TJC compiler, but some usages
are better than others. For example, the following code is
significantly optimized.
catch {unset myvar}
set myvar ""
append myvar STR1 STR2 STR3
The append
command supports appending to a variable that is not yet set,
but this usage is not optimized. For example, the following append
usage would not execute as quickly as the example above.
catch {unset myvar}
append myvar STR1 STR2 STR3
The append
command supports passing multiple values to be appended. It is more
efficient to pass multiple items to be appended to a single append command as opposed to
using multiple append
commands. For example, the following code:
catch {unset myvar}
set myvar ""
append myvar STR1 STR2 \
REAL_LONG_STRING
Will produce more efficient code than the following:
catch {unset myvar}
set myvar ""
append myvar STR1
append myvar STR2
append myvar REAL_LONG_STRING
The append
command supports appending to an array variable, as in the following
example:
set arr(str) ""
append arr(str) HELLO
append arr(str) " "
append arr(str) THERE
While supported, append operations on array
variables can't be optimized and will execute very slowly. Developers
should use append
exclusively with scalar variables. The example given above should be
rewritten as:
set str ""
append str HELLO
append str " "
append str THERE
set arr(str) $str
The list Command
The list
command is optimized for use with many common list operations. The
following example shows the optimal way to initialize a variable for
use with lappend or
any other list operation.
catch {unset mylist}
set mylist [list]
lappend mylist A B C
The example code above is more efficient than the following two
examples. Both of the examples below will initialize the variable to an
empty string instead of an empty list.
catch {unset mylist}
set mylist ""
lappend mylist A B C
catch {unset mylist}
set mylist {}
lappend mylist A B C
All three of the examples above will produce the
exact same results, but the first example that makes explicit use of
the list command to
initialize the variable will execute more quickly.