A pip requirements file lists the installed Python packages, so the file can be used elsewhere to rebuild the original environment.
A requirements file looks like this:
alabaster==0.7.9
arrow==0.8.0
awesome-slugify==1.6.5
Babel==2.3.4
binaryornot==0.4.0
blessings==1.6
What I want is to remove the part that indicates the version. In the case of the first line, alabaster==0.7.9, remove the ==0.7.9 part and leave only alabaster.
I understand that finding a match creates two groups, but I can't get it to work. I am trying it on Ubuntu using awk as follows.
When I ask for the first group:
$ awk -F"==" '{print $1}' base.txt
I get this:
alabaster==0.7.9 arrow==0.8.0 awesome-slugify==1.6.5
that is, the file is printed back unchanged.
When I ask for the second group with
$ awk -F"==" '{print $2}' base.txt
I only get 50 blank lines.
ADDITION:
Now I search with this pattern, (\w+)(==.), which creates two match groups; I am interested in the first one. But if the package is called python-mimeparse, there is no longer a match. It should be possible to allow hyphens, in case some package is called paquete_python or paquete-python.
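To see why the hyphen matters, here is a minimal Python sketch (Python is not one of the requested tools, it is just a quick way to test the pattern). It assumes the pattern is anchored at the start of the line. The widened character class `[\w.-]` and the version numbers used for python-mimeparse and paquete_python are illustrative assumptions, not values from the file.

```python
import re

# The pattern from the question, anchored at the start of the line (assumption).
NARROW = re.compile(r"^(\w+)(==.)")
# Hypothetical variant: also accept hyphens and dots in the package name.
WIDER = re.compile(r"^([\w.-]+)(==.)")


def package_name(line, pattern):
    """Return the first group (the package name), or None if there is no match."""
    m = pattern.match(line)
    return m.group(1) if m else None


print(package_name("arrow==0.8.0", NARROW))             # arrow
print(package_name("python-mimeparse==0.1.4", NARROW))  # None: "-" is not matched by \w
print(package_name("python-mimeparse==0.1.4", WIDER))   # python-mimeparse
print(package_name("paquete_python==1.0", WIDER))       # paquete_python
```

Note that `\w` already matches the underscore, so only the hyphen needs to be added to the class.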
Addendum 2
This expression, (.+)(==)(.+), finds three groups: the first is the package (which is what I'm looking for) and the third is the version. Now I just need to know how to use it in awk.
Addendum 3
I posted an answer that solves the problem in Python, but the idea is to apply the solution with some other tool such as awk, gawk, sed or even perl.
There are several options in this Stack Overflow (English) post, but I haven't been able to use my search pattern with any of them. I get no errors, but no output either.
Some considerations:
- I am looking to get only the package name, not the version
- There is no package installed, so there is nothing to update
- The solution can use another tool, like sed or grep
A. VALUES ON THE LEFT OF ==
Option 1.
Capture everything before ==
Option 2.
Match the text up to == without capturing the == itself
Thank you @fedorqui
B. VALUES TO THE RIGHT OF ==
=.*
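The expressions for these options are not reproduced above, so the following Python sketch is only one possible reading of them. It assumes a capturing group for Option 1, a lookahead for Option 2 and a lookbehind for the values to the right of ==. The function names are hypothetical.

```python
import re


def left_capture(line):
    """Option 1: capture everything before == in a group."""
    m = re.search(r"^(.+)==", line)
    return m.group(1) if m else None


def left_lookahead(line):
    """Option 2: match up to ==, which the lookahead checks but does not consume."""
    m = re.search(r"^.+?(?===)", line)
    return m.group(0) if m else None


def right_lookbehind(line):
    """Values to the right: match whatever follows ==."""
    m = re.search(r"(?<===).+$", line)
    return m.group(0) if m else None


line = "awesome-slugify==1.6.5"
print(left_capture(line))      # awesome-slugify
print(left_lookahead(line))    # awesome-slugify
print(right_lookbehind(line))  # 1.6.5
```

Lookahead and lookbehind need a regex engine that supports them (for example PCRE, as in perl or grep -P); plain POSIX regular expressions do not have them.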
The solution
awk -F'==' '{print $1}' archivo
uses a multicharacter field separator (FS). This is valid as long as you are using a POSIX-compliant version of awk. For example, on Solaris it won't work.
So the question is: how to make it work?
So let's simplify: the file consists of lines of the form module==version. Therefore, what we can do is delete = and everything that follows it. With cut, this means splitting the line on = as the separator (-d=) and printing the first resulting field (-f1).
That can be a bit fragile, so you can also choose to use sed, which does the same thing: it removes everything from the first = symbol onward. However, sed lets you extend the command to something more complex, such as performing the substitution only on lines containing ==. And if you push me, you can go as far as printing only those lines (-n disables printing by default and p prints the current line).
If you really want to use match() in awk, the syntax is match(line, pattern, results array); this three-argument form is a gawk extension. Therefore, it is a matter of capturing the groups that interest us: in this case only the first one, so in fact we could limit ourselves to saying match($0, /^(.+)==/, res), without the need to capture the rest.
In short: awk doesn't seem like the best solution here, because depending on the environment you may have problems with the multicharacter field separator. Make your life easy by using sed in this case: there is no need for such complex regular expressions when a simple sed command already gives you everything you need.
Try this command in Bash:
requirements.txt would be the pip requirements file. The important thing is the regular expression to use, and the one I used allows for the hyphen or underscore separator; I updated the example from @A. Cedano so you can see it live here.
If you need to save the output to a file (you probably do), you can obviously use output redirection; namely:
I hope it helps you, greetings.
The alternative in Python is as follows:
- The module re, which handles regular expressions, is imported.
- pattern is defined with the expression we are looking for:
  - The first group is named paquete and matches any character, any number of times.
  - The last group is named version and is made up of the rest of the characters after the second group.
- The group paquete (that is, the match) is printed.