It’s pattern-matching. Like searching *.txt to get all text files. It’s just… more. There are symbols for matching the start of a string, the end of a string, a set of characters, repetition, etc. Very “etc.” And the syntax blows. The choices of . for match-any-character and * for zero-or-more of whatever comes before it really fuck with common expectations.
It can also replace substrings that match. Like changing the file extension of all text files. Where it gets properly difficult is in “capture groups.” Like looking for all file extensions, and sticking a tilde after the dot. You can put parentheses around part of the pattern being matched and then reference that in the replacement. Conceptually simple - pain in the ass to use properly - syntax both sucks and blows.
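Here’s a rough Python sketch of those three things (matching, replacing, capture groups), in case it helps. The file names and the .md target extension are made up for illustration; only the standard re module is used.

```python
import re

# Hypothetical file names, just for illustration
filenames = ["notes.txt", "report.txt", "photo.jpeg", "archive.tar.gz"]

# Matching: \. is a literal dot, $ anchors the match to the end of the string
txt_pattern = re.compile(r"\.txt$")
text_files = [name for name in filenames if txt_pattern.search(name)]

# Replacing: swap the extension of the text files (".md" is an arbitrary choice)
renamed = [txt_pattern.sub(".md", name) for name in text_files]

# Capture group: (\w+) remembers the extension, \1 puts it back after the tilde
tilde_pattern = re.compile(r"\.(\w+)$")
with_tilde = [tilde_pattern.sub(r".~\1", name) for name in filenames]

print(text_files)
print(renamed)
print(with_tilde)
```

The parentheses in the last pattern are the capture group, and \1 in the replacement is the reference back to it.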
Negative lookahead is what you use to match “ass” but not “assault.” I refuse to elaborate further.
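For anyone who does want it elaborated, a minimal sketch in Python, assuming the goal is “ass not immediately followed by ault.” The helper name is made up for this example.

```python
import re

# (?!ault) is a negative lookahead: it checks that "ault" does NOT come next,
# without consuming any characters itself
pattern = re.compile(r"ass(?!ault)")

def contains_ass_but_not_assault(text):
    """Hypothetical helper: True if some "ass" is not the start of "assault"."""
    return pattern.search(text) is not None

# Start positions of every match in a sample string
positions = [m.start() for m in pattern.finditer("ass assault assess")]
print(positions)
```

Note this still matches the “ass” inside “assess”; you’d need word boundaries (\b) on top if you only want the standalone word.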
My guess is that someone started with a small set of features to find a simple solution for the problem, but the complexity of the problem got waaaay out of hand.
Regexes are actually used in formal computer science (if that’s the right term), e.g. “proof that this and that algorithm won’t deadlock” or something like that.
They’re actually really elegant and can cover a lot. But you’ll have to learn them by using them.
For the purpose of algorithm verification, finite and/or pushdown automata, or probably sometimes even Turing machines, are used, because they are easier to work with. “Real” regular expressions are only nice for writing a grammar for regular languages that can be easily interpreted by the computer, I think. The thing is that regexes in the *nix and programming language world are also used for searching, which is why there are additional special characters to indicate things like “it has to end with …”, and there are shortcuts for when you want a character or sequence to occur
at least once,
once or never or
a specified number of times
back to back.
In “standard” regex, you would only have
() for grouping,
* for 0 or any number of occurrences (so a* means blank or a or aa or …)
+ for combining two characters/groups with “or” (in programming this is written |, and a+ there mostly means the same as aa*, so this is a difference)
and sometimes some way to have a shortcut for (a+b+c+…+z) if you want to allow any lowercase character as the next one
So there are only 4 characters, which have the same expressive power as the extended syntax, with the exception of not being able to indicate that something should occur at the end or beginning of a string/line (which could even be removed if different functions or options had been implemented for the tools we now have instead)
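To make the “same expressive power” point concrete, here’s a small Python sketch that compares each shortcut with a spelled-out version using only grouping, * and “or” (written | in Python, where the formal notation uses +). It only checks a handful of sample strings, so it’s a sanity check and not a proof; the sample list and helper name are my own.

```python
import re

# (extended shortcut, equivalent using only (), * and |)
equivalents = [
    (r"a+", r"aa*"),         # at least once
    (r"a?", r"(|a)"),        # once or never: empty string or a
    (r"a{3}", r"aaa"),       # a specified number of times back to back
    (r"[abc]", r"(a|b|c)"),  # shortcut for a set of characters
]

# Arbitrary sample strings to compare the two spellings on
samples = ["", "a", "aa", "aaa", "aaaa", "b", "c", "ab"]

def agree_on_samples(left, right):
    """Hypothetical helper: do both patterns accept exactly the same samples?"""
    return all(
        bool(re.fullmatch(left, s)) == bool(re.fullmatch(right, s))
        for s in samples
    )

results = [agree_on_samples(left, right) for left, right in equivalents]
print(results)
```

re.fullmatch is used so that the whole string has to match, which is how the formal definition works; that’s also why no start/end anchors are needed here.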
I still don’t understand regex at all
The plural of regex is regrets.
I recommend using https://regex101.com/
It explains all parts of your regex and highlights all matches in your example text. I usually add a comment to a regex101 playground if I use a regex in code.
deleted by creator
I disagree. I found it to be incredibly useful when I knew what regex was and that I needed it, but I couldn’t piece together a single string with it.
Regexes are write-only. No one can understand other people’s regexes.
Not only other people’s regexes. Mine from last week and before, too.
Haha
That’s also 'cause other people’s regexes are garbage!
That’s why they don’t belong anywhere outside single line bash commands.
deleted by creator
Use of regex in production code is a sign of incompetence.
deleted by creator
He’s a troll. He said earlier today that the holocaust wasn’t bad because “not all the jews died”
He’s just trying to pick a fight
I am, and I can think of many cases where plain dumb string matching, since you know what you’re dealing with, beats regex in both performance and maintainability.
You’re a clown that wouldn’t know how to compare two strings without regex even if you got paid 6 figures to do it.
There’s a lot of use cases where regex makes a lot of sense: complex log parsing, determining if a value entered is a valid phone number or email, syntax highlighting, data validation in ML preprocessing, etc. A lot of languages also come with certain features that allow regex to be more efficient than dumb string matching, such as the ability to pre-compile patterns and the flexibility of being able to choose between deterministic and non-deterministic finite automata, should you need efficiency for one use case and flexibility for another. It really depends on what you’re designing and how it’s going to be used, of course.
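A quick Python sketch of both sides, for what it’s worth: plain string matching where the format is known, and a pre-compiled pattern where the input has structure. The phone format (###-###-####) and the function names are assumptions made up for this example, not a real validation rule. It also doesn’t show the automaton choice, since Python’s re module doesn’t expose one.

```python
import re

# Plain string matching: enough when you know exactly what you're looking for
def is_text_file(name):
    return name.endswith(".txt")

# Pre-compiled pattern: built once at import time, reused on every call.
# The ###-###-#### format is a hypothetical example, not a real phone standard.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

def looks_like_phone(value):
    # fullmatch requires the entire value to fit the pattern
    return PHONE_PATTERN.fullmatch(value) is not None

print(is_text_file("notes.txt"), looks_like_phone("555-123-4567"))
```

Which one is faster or easier to maintain really does depend on the input, so measure before picking a side.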
deleted by creator
So one could say that *nix regex is bloated /s
//////?-.,", duh,?!
Nobody does.