The regular expression I wrote gives me results that are the opposite of what I need:
^(?!.*\b(?:BEAR|DOWN|UP|BULL)\b).*/USDT$
I pass it values such as these (one at a time as strings, not as a list):
['BTC/USDT','BTCDOWN/USDT','ETH/BTC','BTCBULL/USDT','ADA/USDT']
The code looks like this:
import re
pairs = ['BTC/USDT','BTCDOWN/USDT','ETH/BTC','BTCBULL/USDT','ADA/USDT']
regex = "^(?!.*\b(?:BEAR|DOWN|UP|BULL)\b).*/USDT$"
for p in pairs:
    if re.match(regex, p):
        print(p)
What I am trying to do is keep only BTC/USDT and ADA/USDT. However, with the current regex, records containing "DOWN" or "BULL" (for example) still appear.
There are two problems in your code: one is a matter of syntax, the other of logic.

The syntax problem is that the sequence \b inside an ordinary Python string is interpreted as the single ASCII backspace character (code 8), not as the two separate characters \ and b that the regular expression needs. You can fix this either by escaping the backslash with a second backslash, writing \\b, or by putting an r in front of the string. The second method is cleaner and is the one recommended in Python: when a string has an r before its opening quote, a backslash inside it is not treated as the start of an escape sequence but as one more character, which is what you need when writing regular expressions.

The logic problem is that the first \b has to go. Inside a regular expression, \b means "word boundary". By writing \b(?:BEAR|DOWN|UP|BULL) you require "BEAR", "DOWN", etc. to start at a word boundary, but in your data they do not: in BTCDOWN or BTCBULL they come right after other letters, at the end of a word rather than at its start.

With both problems fixed, the pattern becomes r"^(?!.*(?:BEAR|DOWN|UP|BULL)\b).*/USDT$", and the code then prints what you expected: BTC/USDT and ADA/USDT.
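For reference, here is a minimal sketch of the corrected script, assuming the same sample list as in the question. The helper name keep_plain_usdt and the variable escape_lengths are made up for this illustration and do not come from the original code.

```python
import re

pairs = ['BTC/USDT', 'BTCDOWN/USDT', 'ETH/BTC', 'BTCBULL/USDT', 'ADA/USDT']

# Raw string: \b reaches the regex engine as backslash + b (word boundary).
# The leading \b from the question is dropped so that suffixes such as
# "DOWN" in "BTCDOWN" are caught by the negative lookahead.
regex = r"^(?!.*(?:BEAR|DOWN|UP|BULL)\b).*/USDT$"


def keep_plain_usdt(values):
    # Hypothetical helper: return the values the corrected pattern accepts.
    return [v for v in values if re.match(regex, v)]


# In a normal string "\b" is one character (backspace);
# in a raw string it is two characters (backslash and b).
escape_lengths = (len("\b"), len(r"\b"))

for p in keep_plain_usdt(pairs):
    print(p)
```

Note that the remaining \b still requires the leveraged-token name to end at a word boundary, so a hypothetical base symbol that merely contains those letters in the middle (say, "UPS/USDT") would not be rejected, while one that ends in them would be.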
Taking the positive approach, you want to keep the items made of BTC or ADA followed by /USDT. A simple pattern achieves this: (BTC|ADA)/USDT. Given the conditions of the problem, this is all that is needed; applied with re.match to the sample values, it produces BTC/USDT and ADA/USDT.
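A minimal sketch of this positive approach, assuming the sample list from the question and that the wanted base symbols (BTC and ADA) are known in advance. The names positive and matches are illustrative only.

```python
import re

pairs = ['BTC/USDT', 'BTCDOWN/USDT', 'ETH/BTC', 'BTCBULL/USDT', 'ADA/USDT']

# Whitelist pattern: only the listed base symbols, followed directly by /USDT.
positive = r"(BTC|ADA)/USDT"

# re.match anchors at the start of the string, so "BTCDOWN/USDT" fails:
# after "BTC" the pattern expects "/" but finds "D".
matches = [p for p in pairs if re.match(positive, p)]

for p in matches:
    print(p)
```

Because re.match only anchors the beginning, a string with extra characters after /USDT would still be accepted. If that matters, re.fullmatch with the same pattern is a stricter alternative.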