Print lines between two patterns, the awk way …

12/10/2010 klashxx

Note: My awk guide.

Example input file:

test -3
test -2
test -1
OUTPUT
top 2
bottom 1
left 0
right 0
page 66
END
test 1
test 2
test 3

The standard way:

awk '/OUTPUT/ {flag=1;next} /END/{flag=0} flag {print}' inputFile

Output:

top 2
bottom 1
left 0
right 0
page 66

Self-explained indented code:

awk '
/OUTPUT/ {flag=1;next}        # Initial pattern found --> turn on the flag and read the next line
/END/    {flag=0}             # Final pattern found   --> turn off the flag
flag     {print}              # Flag on --> print the current line
' inputFile

The first optimization is to get rid of the print: in awk, print is the default action when a condition is true, so when the flag is true the line is echoed.

To drop the next statement while still keeping the tag line out of the output, we need to activate the flag after the “OUTPUT” pattern is found and after the flag has been evaluated. Likewise, the END rule comes first, so the flag is already off when the closing tag line reaches the flag test.

A slight variation of the program flow and we’re done:

awk '/END/{flag=0}flag;/OUTPUT/{flag=1}' inputFile
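A quick way to check that the two variants behave the same is to run both on the sample and compare the results. This is only a sketch: the temporary file and the variable names `with_next` and `without_next` are made up for the illustration.

```shell
#!/bin/sh
# Recreate the example input in a throwaway file (the path is arbitrary).
input=$(mktemp)
printf '%s\n' 'test -3' 'test -2' 'test -1' OUTPUT \
              'top 2' 'bottom 1' 'left 0' 'right 0' 'page 66' \
              END 'test 1' 'test 2' 'test 3' > "$input"

# Original form: explicit print and next.
with_next=$(awk '/OUTPUT/ {flag=1;next} /END/{flag=0} flag {print}' "$input")

# Optimized form: flag is tested before OUTPUT can switch it on
# and after END has switched it off, so neither tag is printed.
without_next=$(awk '/END/{flag=0}flag;/OUTPUT/{flag=1}' "$input")

if [ "$with_next" = "$without_next" ]; then
    echo "same output"
fi
printf '%s\n' "$without_next"
rm -f "$input"
```

The order of the three rules is what matters here; swapping the flag test with either neighbour brings one of the tag lines back into the output.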

PS: What if we only want to print the lines enclosed between the OUTPUT and END tags? Check this …


Anonymous coward

05/10/2012 at 00:56

sed -n '/OUTPUT/,/END\ related/p'

That should do the same job, only more elegantly.
klashxx
11/10/2012 at 19:02

Well, not exactly the same:
```
# sed -n  '/OUTPUT/,/END/p' infile
OUTPUT
top 2
bottom 1
left 0
right 0
page 66
END
```
To exclude the tags you must go a little further:
```
# sed -n '1,/OUTPUT/!{ /END/,/OUTPUT/!p; }' infile 
top 2
bottom 1
left 0
right 0
page 66
```
See: http://sed.sourceforge.net/sedfaq4.html#s4.24

The same drawbacks apply.

I still prefer awk for excluding the tags.

Cheers and thanks for the feedback.
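For comparison, awk has its own range pattern, which includes both tag lines just like the sed address pair does. The sketch below puts it next to the flag version on a shortened copy of the sample; the file and the variable names `inclusive` and `exclusive` are only there for the illustration.

```shell
#!/bin/sh
input=$(mktemp)
printf '%s\n' 'test -1' OUTPUT 'top 2' 'bottom 1' END 'test 1' > "$input"

# Range pattern: true from a line matching OUTPUT through the next line matching END.
inclusive=$(awk '/OUTPUT/,/END/' "$input")

# Flag version from the post, which leaves both tags out.
exclusive=$(awk '/END/{flag=0}flag;/OUTPUT/{flag=1}' "$input")

printf 'with tags:\n%s\n\nwithout tags:\n%s\n' "$inclusive" "$exclusive"
rm -f "$input"
```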
Sarathi

17/10/2012 at 14:52

Is there any way to eliminate repetition? For example, if I have input like the one below and I need only the first block of lines between OUTPUT and END:

OUTPUT
top 2
bottom 1
left 0
END
right 0
OUTPUT
page 66
test 1
test 2
test 3
END
klashxx
17/10/2012 at 19:44

Of course, just use the exit statement when finding the first “END” tag.
```
$ awk '/END/{exit}flag;/OUTPUT/{flag=1}' inputFile
top 2
bottom 1
left 0
```
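If a later block is needed instead of the first one, one possible approach is to count the opening tags. This is a sketch under the assumption that "the n-th OUTPUT..END block" is what is wanted; `want` is a hypothetical variable name, not something from the post.

```shell
#!/bin/sh
input=$(mktemp)
printf '%s\n' OUTPUT 'top 2' 'bottom 1' 'left 0' END 'right 0' \
              OUTPUT 'page 66' 'test 1' 'test 2' 'test 3' END > "$input"

# want is a hypothetical parameter: which OUTPUT..END block to print (1-based).
second=$(awk -v want=2 '
/END/    { if (flag && n == want) exit; flag = 0 }  # stop once the wanted block closes
flag && n == want                                   # print only inside the wanted block
/OUTPUT/ { flag = 1; n++ }                          # count blocks as they open
' "$input")

printf '%s\n' "$second"
rm -f "$input"
```

With `want=1` this reduces to the exit version shown above.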
- Sarathi
  
  18/10/2012 at 10:20
  
  Thank you so much, klashxx, that helped.
vishnu ram

18/10/2012 at 10:23

Then how can we get the lines between the next occurrence of the same two strings?
- klashxx
  
  18/10/2012 at 21:37
  
  Please describe what you are trying to accomplish, with a clear example.
Chaitanya vemuru

20/12/2012 at 15:27

Is there any way to print the lines between two patterns only after some other pattern has matched?
For example:
xyz
abc
asd
sdf
fghj
kje
dnsk

I need a script that checks whether the file has “xyz” and “abc” on contiguous lines and, if so, prints the text between “asd” and “dnsk”.

Please respond as soon as possible.
- klashxx
  20/12/2012 at 23:43
  
  This way:

  $ cat file
  xyz
  abc
  asd
  sdf
  fghj
  kje
  dnsk

  $ awk '/dnsk/{exit}flag;/xyz/{c=NR}/abc/&&NR==(c+1){flag=1}' file
  asd
  sdf
  fghj
  kje
Christof R
15/02/2013 at 14:24

Hi. Thanks for this blog entry. I stumbled across it while searching for an extension of multimarkdown. I would like to define blocks:
```
text
text
text

~~~~ ID
Lorem ipsum dolor 
sit amet, consectetur 
adipiscing elit.
~~~~

text
text
text
```
and transform it to latex like:
```
text
text
text

\begin{ID}
Lorem ipsum dolor 
sit amet, consectetur 
adipiscing elit.
\end{ID}

text
text
text
```
I thought what you showed here would bring me there but I couldn’t figure out how. Any hints?

BR
- klashxx
  18/02/2013 at 23:41
  
  Hi Christof, for your problem I would use a perl one-liner to perform an in-place replacement:

  # cat inputFile
  text
  text
  text
  ~~~~ ID
  Lorem ipsum dolor
  sit amet, consectetur
  adipiscing elit.
  ~~~~
  text
  text
  text

  # perl -pi -e 's/^~{4}\s+ID/\\begin{ID}/g;s/^~{4}[ \t]*$/\\end{ID}/g' inputFile

  # cat inputFile
  text
  text
  text
  \begin{ID}
  Lorem ipsum dolor
  sit amet, consectetur
  adipiscing elit.
  \end{ID}
  text
  text
  text
  - Christof R
    
    19/02/2013 at 18:17
    
    Hi and thanks for your answer. This is really concise, however, I did not clarify that ID is a variable string. So I have to read it somehow at the start of the block and use it again at the end. That got me down…
  - klashxx
    21/02/2013 at 08:51
    OK, I got you. There are many ways to solve your problem; give this a try:
    
    awk '/^~~~~/{if ($2!=""){s=$2};$0= $2!="" ? "\\begin{"s"}" : "\\end{"s"}"}1' inputFile
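To see how the one-liner copes with more than one block name, it can be run on a small made-up input. The block names `note` and `proof` and the sample text are invented for this sketch; only the awk program comes from the comment above.

```shell
#!/bin/sh
input=$(mktemp)
printf '%s\n' 'text' '~~~~ note' 'Lorem ipsum' '~~~~' \
              'text' '~~~~ proof' 'dolor sit amet' '~~~~' > "$input"

# An opening line carries the name in $2, which is remembered in s;
# a bare ~~~~ line reuses s to build the matching \end{...}.
converted=$(awk '/^~~~~/{if ($2!=""){s=$2};$0= $2!="" ? "\\begin{"s"}" : "\\end{"s"}"}1' "$input")

printf '%s\n' "$converted"
rm -f "$input"
```

Because only one name is remembered at a time, this works for blocks that follow each other, not for nested ones.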
  - DI Christof Rath
    
    22/02/2013 at 21:53
    
    Perfekt. Thank you again.
  - bimleshsharma
    
    08/09/2013 at 16:54
    
    Actually i wanted this with one condition:
    log file:
    asd
    START
    as
    erg
    ege
    4t
    END
    lgjlkej
    nelgkl
    START
    lrkglk
    egiorgklk
    gljegj
    google
    lwekglk
    END
    
    So I need the lines between START and END only for the blocks that contain ‘google’. Please help.
  - klashxx
    09/09/2013 at 08:52
    This solution uses an array; it rewinds the index if the “google” pattern is not present between the tags. So, having this text file:

    asklasja
    asas
    START
    google
    saas
    END
    asd
    START
    as
    asassa
    da
    erg
    ege
    4t
    END
    lgjlkej
    nelgkl
    START
    lrkglk
    egiorgklk
    gljegj
    google
    lwekglk
    END
    assa
    asas

    The code will be:

    awk '/END/ {flag=0;if(x){L=j}else{j=L};x=0} /google/{x=1} flag {a[++j]=$0;next} /START/ {flag=1} END {for (i=1;i<=j;i++){print a[i]}}' infile

    And the result:

    google
    saas
    lrkglk
    egiorgklk
    gljegj
    google
    lwekglk
justanotherhumanoid

17/09/2013 at 14:15

Reblogged this on justanotherhumanoid and commented:
Beautiful display of AWK craftsmanship. Don’t miss the solutions in the comments section.
- Cithosi
  
  12/05/2014 at 10:49
  
  Hi,
  with a similar scenario, I need the output to include the START and END strings; the output would look like:
  
  START
  google
  saas
  END
  START
  lrkglk
  egiorgklk
  gljegj
  google
  lwekglk
  END
  
  Please help
  Thanks
  Cithosi
  - klashxx
    12/05/2014 at 11:42
    Hello, just change the matching order in awk so that the flag is set before the line is printed:
    
    awk '/START/{flag=1}flag;/END/{flag=0}' infile
    
    Or use this concise sed:
    
    sed -n '/START/,/END/p' infile
    
    It’s up to you, but I suppose sed will perform slightly better on large files.
  - Cithosi
    
    12/05/2014 at 12:27
    
    Thanks for the quick reply. I need to print a block only if “google” exists within it, please; my file has over 3000 lines of text.
    
    Thanks
  - klashxx
    12/05/2014 at 13:38
    OK, having this sample file:

    asas
    START
    weqeq
    eqwe
    eqwe
    google
    END
    eer
    START
    ccc
    ccc
    ccc
    END
    assa
    sas
    START
    zzzz
    google
    END
    lll

    You can apply this gawk script (gawk is available on most Linux systems; with other awk implementations you may need to delete the elements of the array one by one):

    gawk '
    /START/          {flag=1;found=j=0;delete a}   # Beginning pattern -> initialization
    flag             {a[++j]=$0}                   # If flag is set, store the line in the array
    /google/ && flag {found=1}                     # If google & flag, set found as true
    /END/            {flag=0                       # Ending pattern: if found, show our array
                      if(found){for (i=1;i<=j;i++){ print a[i]}}}
    ' infile

    The result will be:

    START
    weqeq
    eqwe
    eqwe
    google
    END
    START
    zzzz
    google
    END
    
    Machinery is simple: store the text between tags but show only if pattern is found.
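Where gawk is not at hand, a commonly used portable idiom for emptying an array is `split("", a)`. The sketch below swaps it in for `delete a` and otherwise keeps the same machinery; it also only prints at END when a block is actually open, which is an extra guard added here, not part of the script above.

```shell
#!/bin/sh
input=$(mktemp)
printf '%s\n' asas START weqeq eqwe eqwe google END eer \
              START ccc ccc ccc END assa sas START zzzz google END lll > "$input"

blocks=$(awk '
/START/          { flag = 1; found = j = 0; split("", a) }  # new block: reset state and empty the array
flag             { a[++j] = $0 }                            # inside a block: buffer the line
/google/ && flag { found = 1 }                              # remember that the block matched
/END/            { if (flag && found)                       # block closed: print it only if it matched
                       for (i = 1; i <= j; i++) print a[i]
                   flag = 0 }
' "$input")

printf '%s\n' "$blocks"
rm -f "$input"
```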
  - Cithosi
    
    12/05/2014 at 14:49
    
    It worked, much appreciated.
    
    Thanks
Rahul

23/12/2013 at 19:55

Hi Klashxx, thanks for this incredibly helpful post. I have one question though. I want to pass the begin tag as a command-line argument. This is what I tried:
awk 'BEGIN{a='$1'}/END/{exit}flag;/a/{flag=1}' text.txt
but it doesn’t work. What’s the solution?
- klashxx
  24/12/2013 at 08:33
  
  Hi Rahul, you can go the standard path, using the -v flag:

  # awk -v pat="OUTPUT" '/END/{exit}flag;match($0,"^"pat){flag=1}' text.txt

  Or the tricky way:

  # pat="OUTPUT"
  # awk '/END/{exit}flag;/'${pat}'/{flag=1}' text.txt

  See Passing values to awk, the trick.
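The reason `/a/` in the attempt above cannot work is that the text between slashes is always a literal regex, so it matches the letter a rather than the variable. A variable can be used as a pattern with the `~` operator instead. The sketch below passes both tags that way; `start` and `stop` are hypothetical variable names chosen for the illustration.

```shell
#!/bin/sh
input=$(mktemp)
printf '%s\n' 'test -1' OUTPUT 'top 2' 'bottom 1' END 'test 1' > "$input"

# start and stop are hypothetical names; -v assigns them before the program runs.
between=$(awk -v start='^OUTPUT' -v stop='^END' '
$0 ~ stop  { flag = 0 }   # dynamic regex: the variable content is used as the pattern
flag                      # print while inside the block
$0 ~ start { flag = 1 }
' "$input")

printf '%s\n' "$between"
rm -f "$input"
```

Compared with splicing a shell variable into the program text, this keeps the awk code in a single quoted string, so tags containing spaces or quotes are less likely to break it.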
vibin

02/02/2016 at 08:11

Hi Klash,

I need help with an awk statement that looks between two patterns and prints the line only if there is exactly one line between them.

I want to print abcd, which is the only line in between.

Example:

-----------
abcd
-----------
efgh
ijklm
-----------
- klashxx
  
  10/02/2016 at 12:50
  
  Having:

  $ cat ex
  ---
  abcd
  ---
  efgh
  ijklm
  ---
  xxgfgh
  ---

  This code will do the trick:

  $ awk '/^---/{if(i==1){print a[i]};i=0;next}{a[++i]=$0}' ex
  abcd
  xxgfgh
shiv mohan

30/06/2016 at 06:37

Hi,

I want to print the lines between the marker patterns, but I also want to print the lines matching other patterns.
For example, if my input file is:
START
as
erg
ege
END
abc
xyz
pqr
xyz
asd
NOW
lrkglk
egiorgklk
gljegj
google
NOT

I want to print the lines between START and END, the lines between NOW and NOT, and the lines matching xyz, i.e. my output should be:
START
as
erg
ege
END
xyz
xyz
NOW
lrkglk
egiorgklk
gljegj
google
NOT

Please help me. It’s very urgent!!!
- klashxx
  
  30/06/2016 at 07:47
  
  awk '/START|NOW/{flag=1}flag||/xyz/;/END|NOT/{flag=0}' file