Let's imagine that I have a file with the following fields, delimited by the token $$$$:
$$$$
> <DATABASE_ID>
HMDB0000016
> <DATABASE_NAME>
hmdb
> <FORMULA>
C21H30O3
$$$$
> <DATABASE_ID>
HMDB0000017
> <DATABASE_NAME>
hmdb
> <FORMULA>
C8H9NO4
$$$$
> <DATABASE_ID>
HMDB0000020
> <DATABASE_NAME>
hmdb
> <FORMULA>
C8H8O3
$$$$
In this case, I would like to use an awk script to retrieve all the fields that lie between two tokens, selecting the block by the value of its DATABASE_ID field. For example, suppose I choose the value HMDB0000020. How could I pass this value to the awk script so that it prints all the fields between the corresponding tokens?
So far, I have managed to write a one-line command that prints what lies between the two tokens. I arrived at it thanks to the post https://es.stackoverflow.com/a/407481/83.
Continuing with the example:
awk -v RS='\n\\$\\$\\$\\$\n' '/HMDB0000020/' fichero
> <DATABASE_ID>
HMDB0000020
> <DATABASE_NAME>
hmdb
> <FORMULA>
C8H8O3
However, I'm interested in being able to call a script using awk -f together with -v variable=HMDB0000020 and get the same output, but I don't know how to rewrite the command I have.
Any ideas?
Thanks in advance.
If you want to move the logic to an Awk file, you simply have to create a new file and, for better visibility, give it the .awk extension. For example, we can create "s.awk" with this content:
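A minimal sketch of what such a file could contain, based on the description in this answer (RS set in a BEGIN block, records filtered with $0 ~ var). It is written as a shell heredoc so it can be pasted into a terminal. It assumes gawk or mawk, since a multi-character RS treated as a regular expression is not guaranteed by every awk:

```shell
# Create s.awk in the current directory (quoted 'EOF' keeps $ and \ untouched)
cat > s.awk <<'EOF'
# Use "newline, $$$$, newline" as the record separator,
# the same value the question's one-liner passes with -v RS=...
BEGIN { RS = "\n\\$\\$\\$\\$\n" }

# Print every record that matches the pattern received through -v var=...
$0 ~ var
EOF
```

A pattern with no action prints the matching record, which is why the second rule needs no explicit print.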
And we execute it as follows:
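The call would presumably look something like awk -f s.awk -v var="HMDB0000020" fichero. As a sketch, the same call can be wrapped in a small shell function so that the ID and the file become parameters. The function name print_record is hypothetical and not part of the original answer:

```shell
# Hypothetical wrapper: prints every record of a data file that matches an ID.
# Assumes s.awk is in the current directory.
print_record() {
    # $1 = value to look for, $2 = data file
    awk -v var="$1" -f s.awk "$2"
}

# Example call, once s.awk and fichero exist:
#   print_record HMDB0000020 fichero
```

Note that -f must be followed immediately by the script name; -v can go before or after it, as long as both come before the data file.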
That is, we run Awk with the script s.awk and pass it the variable var using the syntax -v nombre_variable="valor variable", as mentioned in How can I use shell variables in awk?. Finally, fichero is the file you want to process.
As for the content of the script itself, notice that I have moved the RS logic into the BEGIN block, which is how this value can be set without passing it "inline" when calling Awk. That is, awk -v RS="bla" is the same as awk 'BEGIN{RS="bla"}' and lets us keep it inside the file.
Then, the $0 ~ var part checks whether the current record matches the pattern passed in through the variable "var". Because of RS, $0 here is a whole block between tokens rather than a single line, and the comparison is a regular-expression match, not an equality test.
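The equivalence between -v RS=... and a BEGIN assignment can be checked on a toy input. The data below is made up for illustration and is not the question's file. As before, a multi-character RS assumes gawk or mawk:

```shell
# Toy input: two records, "one" and "two", separated by the string "bla"

# Record separator passed on the command line
printf 'oneblatwo' | awk -v RS="bla" '{ print NR ": " $0 }'

# Same separator set in a BEGIN block; this form can live inside a .awk file
printf 'oneblatwo' | awk 'BEGIN { RS = "bla" } { print NR ": " $0 }'

# Selecting a record with a pattern held in a variable, as $0 ~ var does
printf 'oneblatwo' | awk -v var="two" 'BEGIN { RS = "bla" } $0 ~ var'
```

The first two commands should print the same numbered records, and the third should print only the record that matches var.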