This directory contains some examples illustrating techniques for extracting
high performance from flex scanners.  Each program implements a simplified
version of the Unix "wc" tool: read text from stdin and print the number of
characters, words, and lines present in the text.  All programs were compiled
using gcc (version unavailable, sorry) with the -O flag, and run on a
SPARCstation 1+.  The input used was a PostScript file, mainly containing
figures, with the following "wc" counts:

        lines   words   characters
        214217  635954  2592172


The basic principles illustrated by these programs are:

        - match as much text with each rule as possible
        - adding rules does not slow you down!
        - avoid backing up

and the big caveat that comes with them is:

        - you buy performance with decreased maintainability; make
          sure you really need it before applying the above techniques.

See the "Performance Considerations" section of flexdoc for more
details regarding these principles.


The different versions of "wc":

        mywc.c  a simple but fairly efficient C version

        wc1.l   a naive flex "wc" implementation

        wc2.l   somewhat faster; adds rules to match multiple tokens at once

        wc3.l   faster still; adds more rules to match longer runs of tokens

        wc4.l   fastest; still more rules added; hard to do much better
                using flex (or, I suspect, hand-coding)

        wc5.l   identical to wc3.l except one rule has been slightly
                shortened, introducing backing-up


Timing results (all times in user CPU seconds):

        program   time   notes
        -------   ----   -----
        wc1       16.4   default flex table compression (= -Cem)
        wc1        6.7   -Cf compression option
        /bin/wc    5.8   Sun's standard "wc" tool
        mywc       4.6   simple but better C implementation!
        wc2        4.6   as good as C implementation; built using -Cf
        wc3        3.8   -Cf
        wc4        3.3   -Cf
        wc5        5.7   -Cf; ouch, backing up is expensive
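
For readers who want a concrete baseline to compare the flex scanners
against, the kind of loop a straightforward C "wc" might use can be sketched
as below.  This is NOT the actual mywc.c from this directory; the names
wc_counts, count_buffer, and count_stream are hypothetical, and the sketch
assumes that a word is any maximal run of characters other than space, tab,
and newline.

```c
#include <stddef.h>
#include <stdio.h>

/* Running totals, in the order "wc" prints them (hypothetical type). */
struct wc_counts {
    long lines;
    long words;
    long chars;
    int in_word;    /* nonzero while inside a word; carried across chunks */
};

/* Count one chunk of text.  Assumption of this sketch: a word is a
 * maximal run of characters other than space, tab, and newline. */
void count_buffer(const char *buf, size_t len, struct wc_counts *c)
{
    size_t i;

    for (i = 0; i < len; ++i) {
        char ch = buf[i];

        if (ch == '\n')
            ++c->lines;

        if (ch == ' ' || ch == '\t' || ch == '\n') {
            c->in_word = 0;
        } else if (!c->in_word) {
            c->in_word = 1;
            ++c->words;     /* first character of a new word */
        }
    }
    c->chars += (long) len;
}

/* Count a whole stream by reading it in large blocks, so that the
 * per-character work is the loop above and nothing else. */
void count_stream(FILE *in, struct wc_counts *c)
{
    char buf[8192];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        count_buffer(buf, n, c);
}
```

To use this as a filter, zero-initialize a struct wc_counts, call
count_stream(stdin, &c), and print c.lines, c.words, and c.chars.  The
in_word flag is kept in the struct so that a word split across two reads is
counted only once.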