Assumes Tcl 8.6 (coroutine support)
if {[catch {package req Tcl 8.6}]} return
Rosetta example: Tokenize a string with escaping
Write a class which allows for splitting a string at each non-escaped occurrence of a separator character.
package req nx

nx::Class create Tokenizer {
    :property s:required
    :method init {} {
        # Give the object its own namespace; it holds the coroutine command.
        :require namespace
        set coro [coroutine [current]::nextCoro [current] iter ${:s}]
        # Calls of "next" on this object resume the coroutine.
        :public object forward next $coro
    }
    :public method iter {s} {
        # The first yield hands the coroutine name back to init.
        yield [info coroutine]
        for {set i 0} {$i < [string length $s]} {incr i} {
            yield [string index $s $i]
        }
        # Input exhausted: terminate the loop of the caller.
        return -code break
    }
    :public object method tokenize {{-sep |} {-escape ^} s} {
        set t [[current] new -s $s]
        set part ""
        set parts [list]
        # The loop ends when the coroutine returns with code "break".
        while {1} {
            set c [$t next]
            if {$c eq $escape} {
                # Take the character after the escape literally.
                append part [$t next]
            } elseif {$c eq $sep} {
                lappend parts $part
                set part ""
            } else {
                append part $c
            }
        }
        lappend parts $part
        return $parts
    }
}
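The class relies on a coroutine that hands out one character per call and ends the consuming loop via `return -code break`. As a rough illustration of that idiom alone, here is a minimal plain-Tcl sketch without NX. The names `charIter` and `nextChar` are hypothetical and not part of the example above.

```tcl
package require Tcl 8.6

# Hypothetical helper: yields the characters of s one at a time.
proc charIter {s} {
    # The first yield returns the coroutine name to its creator.
    yield [info coroutine]
    foreach c [split $s ""] {
        yield $c
    }
    # Signal exhaustion: the resuming call returns with code "break".
    return -code break
}

# Create the coroutine command; it runs up to the first yield.
coroutine nextChar charIter "abc"

set seen [list]
while {1} {
    # Once the coroutine is exhausted, this call breaks out of the loop.
    set c [nextChar]
    lappend seen $c
}
puts $seen
```

The consumer needs no explicit end-of-input test, which is why `tokenize` can be written as an unconditional `while {1}` loop.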
Run some tests, including ones involving the escape character:
% Tokenizer tokenize -sep | -escape ^ ^|
|
% Tokenizer tokenize -sep | -escape ^ ^|^|
||
% Tokenizer tokenize -sep | -escape ^ ^^^|
^|
% Tokenizer tokenize -sep | -escape ^ |
{} {}
Test for the output required by the Rosetta example:
% Tokenizer tokenize -sep | -escape ^ one^|uno||three^^^^|four^^^|^cuatro|
one|uno {} three^^ four^|cuatro {}
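For comparison, the same splitting rule can be sketched as a single plain-Tcl procedure that scans the string by index instead of using NX and a coroutine. This is a swapped-in technique, not the approach of the example itself. The name `tokenizePlain` and its positional optional arguments are assumptions made for illustration.

```tcl
# Hypothetical stand-alone variant; not part of the NX example.
proc tokenizePlain {s {sep |} {escape ^}} {
    set parts [list]
    set part ""
    set n [string length $s]
    for {set i 0} {$i < $n} {incr i} {
        set c [string index $s $i]
        if {$c eq $escape} {
            # Take the next character literally. A trailing escape
            # character appends nothing, since the index is out of range.
            incr i
            append part [string index $s $i]
        } elseif {$c eq $sep} {
            # Unescaped separator: close the current token.
            lappend parts $part
            set part ""
        } else {
            append part $c
        }
    }
    # The last token is closed by the end of the input.
    lappend parts $part
    return $parts
}

puts [tokenizePlain {one^|uno||three^^^^|four^^^|^cuatro|}]
```

For the Rosetta input this should print the same five-element list as the NX version, including the two empty tokens.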