Print lines between two patterns revisited, perl to help


This is a very common task in text processing.

Remember our awk example?

That was a handy one, but what if the opening tag is never closed?

# cat inputFile
test -3
test -2
test -1
OUTPUT
top 2
bottom 1
left 0
right 0
page 66
END
test 1
OUTPUT
test 2
test 3
test -3
test -2
# awk '/END/{flag=0}flag;/OUTPUT/{flag=1}' inputFile
top 2
bottom 1
left 0
right 0
page 66
test 2
test 3
test -3
test -2

As expected, the lines below the unclosed 2nd OUTPUT tag are printed. That could well be fine ...

But what if we only want to print the lines enclosed between an OUTPUT tag and its END tag?

Most of the time we approach problems in a very “straight logical” way:

read lines -> find the OPENING TAG -> store lines until the CLOSING TAG is found -> print the lines and release the storage … and so on
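For comparison, the straightforward logic above can be sketched in awk roughly like this. It is only a minimal sketch: the function name print_closed_blocks and the variables buf and inside are made up for illustration, and it assumes that a second OUTPUT appearing before any END should simply start a fresh buffer.

```shell
# print_closed_blocks is a hypothetical helper name, not part of the original post.
# It stores the lines that follow OUTPUT and prints them only once END shows up.
print_closed_blocks() {
    awk '
        /OUTPUT/ { buf = ""; inside = 1; next }    # opening tag: start a fresh buffer
        /END/    { if (inside) printf "%s", buf    # closing tag: print what was stored
                   buf = ""; inside = 0; next }    # then release the storage
        inside   { buf = buf $0 "\n" }             # inside a block: store the line
    ' "$@"
}

# Usage: the second OUTPUT is never closed, so its buffered lines are dropped.
printf '%s\n' 'test -1' OUTPUT 'top 2' 'bottom 1' END 'test 1' OUTPUT 'test 2' 'test 3' |
    print_closed_blocks
```

With that sample input only "top 2" and "bottom 1" are printed. It works, but it needs a buffer and two state variables.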

Makes sense? Yes ... but what if we use reverse logic?

# cat inputFile
test -3
test -2
test -1
OUTPUT
top 2
bottom 1
left 0
right 0
page 66
END
test 1
OUTPUT
test 2
test 3
test -3
test -2
# perl -e 'print reverse <>' inputFile | awk '/OUTPUT/{flag=0}flag;/END/{flag=1}' |perl -e 'print reverse <>'
top 2
bottom 1
left 0
right 0
page 66

The “creative” way:

reverse the file -> use our old awk (with the two tags swapped) -> reverse the results

Easy and quick!
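If GNU coreutils is at hand, tac can do the reversing instead of perl. This is just a variant of the same pipeline, assuming tac is installed (it is not available by default on every UNIX); the function name closed_block_tac is made up for illustration.

```shell
# closed_block_tac is a hypothetical helper name; it assumes GNU tac is available.
closed_block_tac() {
    tac "$@" |                                        # reverse the input
        awk '/OUTPUT/{flag=0} flag; /END/{flag=1}' |  # old awk, tags swapped
        tac                                           # reverse the results back
}

# Usage with a shortened version of the sample data
printf '%s\n' 'test -1' OUTPUT 'top 2' 'bottom 1' END 'test 1' OUTPUT 'test 2' 'test 3' |
    closed_block_tac
```

As before, only the lines of the properly closed block ("top 2" and "bottom 1") come out.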

Of course, this could be done in a single perl call:

# perl -e 'foreach (reverse <>){if (/END/ .. /OUTPUT/){push (@a,$_) unless /END|OUTPUT/}};print reverse(@a)' inputFile
top 2
bottom 1
left 0
right 0
page 66