On Sun, 2 Jun 2013, Mike Miller wrote:

> Pretty close, but it's sending out an extra pair of newlines for every
> space in the format string. It does seem to be super fast, though,
> probably faster than the perl scripts I'm testing, which is great.

I was able to fix that newline problem. All I had to do was delete one
line:

26d25
< print "\n"

Another issue is with the way awk and sed handle the data stream -- if
they only need to work with the first few lines, they still process the
entire file. It would be great if it were possible to tell awk to stop
at the last requested line.

Here's an example where I send 10 million lines to the script. This is
how much time it takes just to make all those lines:

$ time -p seq 10000000 >/dev/null
real 5.11
user 5.11
sys 0.01

Here's how long it takes to process those 10 million lines when only the
first 55 lines are needed:

$ time -p seq 10000000 | print_ranges.awk - "1-5 55-55 27-27" >/dev/null
real 27.01
user 20.36
sys 1.25

But here's how long it takes when I add "head -55" to the pipe to drop
the unused lines before piping to the awk script:

$ time -p seq 10000000 | head -55 | print_ranges.awk - "1-5 55-55 27-27" >/dev/null
real 0.05
user 0.01
sys 0.00

My friend's perl script doesn't reorder the lines and it is much slower,
but my friend is working on making it stop after the last processed
line, and if that succeeds it will be much faster.

$ time -p seq 10000000 | ./cutrows_King_1999.pl 1-5,55,27
1
2
3
4
5
27
55
real 56.59
user 59.56
sys 0.07

I probably should try to learn enough perl and awk to understand these
scripts more completely. With what I know now, I think I could write a
wrapper that reads in a string like -5,55,27, puts out "1-5 55-55 27-27",
and also uses "head -55" to reduce the workload on awk.

One thing I don't know how to do is to make something like 92- get
interpreted like this...

92-$(wc -l file)

...but without having to run wc -l, which could take forever on a very
big file, possibly doubling the processing time.

I also have to test these things for how well they deal with stuff like
1-10,12-10000000. That is, just dropping one line out of a 10 million
line file.

Mike
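
P.S. Here is roughly what I imagine the early exit looking like inside the
awk script itself -- I'm only guessing at how print_ranges.awk stores the
ranges, so assume it keeps the largest requested line in a variable that
I'm calling maxline here:

NR > maxline { exit }   # quit once we're past the last requested line

That should give about the same speedup as the "head -55" trick in the
example above, without the extra process in the pipe.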
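
P.P.S. And a rough, untested sketch of the wrapper I described -- it turns
-5,55,27 into "1-5 55-55 27-27", figures out the largest line needed for
head, and falls back to reading the whole file when there is an open-ended
range like 92- (the 999999999 is just a stand-in for "last line", since I
still don't know how to get the real number without running wc -l):

#!/bin/sh
# usage: cutrows_wrapper.sh "RANGESPEC" FILE
spec=$1
file=$2

ranges=""
max=0
open=0

for r in $(printf '%s\n' "$spec" | tr ',' ' '); do
    case $r in
        *-*) lo=${r%%-*}; hi=${r#*-} ;;   # "1-5", "-5", "92-"
        *)   lo=$r;       hi=$r      ;;   # "27" becomes "27-27"
    esac
    [ -z "$lo" ] && lo=1                  # "-5"  means "1-5"
    if [ -z "$hi" ]; then                 # "92-" means 92 to end of file
        open=1
        hi=999999999
    fi
    ranges="$ranges $lo-$hi"
    [ "$hi" -gt "$max" ] && max=$hi
done
ranges=${ranges# }                        # drop the leading space

if [ "$open" -eq 1 ]; then
    # an open-ended range means awk has to read the whole file anyway
    print_ranges.awk - "$ranges" < "$file"
else
    head -n "$max" "$file" | print_ranges.awk - "$ranges"
fi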