RegEx

leobm, German
@leobm@norden.social avatar

The ?x modifier/flag is nice, never used it before.
Makes it possible to include commentary inside complicated patterns.

original source: https://polar.sh/eval/posts/named-capturing-groups-in-clojure
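
For readers who have not tried it, here is a minimal sketch of the same flag in Python, where it is spelled `(?x)` or `re.VERBOSE`. The version-string pattern and the name `parse_version` are illustrative assumptions, not taken from the linked post.

```python
import re

# (?x) turns on verbose mode: unescaped whitespace in the pattern is
# ignored, and "#" starts a comment that runs to the end of the line.
SEMVER = re.compile(r"""(?x)
    (\d+) \.   # major
    (\d+) \.   # minor
    (\d+)      # patch
""")

def parse_version(text):
    """Return (major, minor, patch) as ints, or None if text does not match."""
    m = SEMVER.fullmatch(text)
    return tuple(int(g) for g in m.groups()) if m else None
```

In this mode a space that should match literally has to be written as `[ ]` or escaped with a backslash.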

chrastecky,
@chrastecky@phpc.social avatar

Lo and behold, fellow mortals of the programmer variety! I find myself embarking upon a most arduous quest.

One that shall test the mettle of my coding prowess... Yes, thee heard right.

I dare to dance with the elusive and powerful entity known as... the #regex #email #validation!

heiglandreas,
@heiglandreas@phpc.social avatar

@chrastecky All right! 3 step email verification:

  1. No '@', no email address
  2. No MX entry in DNS for the part *after* the '@', no email address
  3. No response to a link in an email to the email address, no email address.

Everything else might look like an email address, but it isn't.... 🤷
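
The first two steps can be sketched roughly as below. This is only an illustration: `has_mx` is a hypothetical callable that the caller must supply (for example, a wrapper around a DNS library), the non-empty check on both sides of the '@' is an added assumption, and step 3 cannot be done in code alone because it needs a real mail round trip.

```python
def plausible_address(address, has_mx):
    """Steps 1 and 2 of the checklist; step 3 needs a confirmation email.

    has_mx is a hypothetical callable taking a domain name and returning
    True when that domain has an MX record.
    """
    local, at, domain = address.rpartition("@")
    # Step 1: no '@', no email address. Requiring something on both
    # sides of the '@' is an extra assumption of this sketch.
    if not at or not local or not domain:
        return False
    # Step 2: no MX entry for the part after the '@', no email address.
    return bool(has_mx(domain))
```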

heiglandreas,
@heiglandreas@phpc.social avatar

@chrastecky And while

ändi@stella.maris.solutions

looks like a valid email address (if it doesn't, check your algorithm), it is about as valid as "hello world" as an email address.

But that might just be different expectations of what a valid email address is. 🤷

When your customer is happy with above string being considered a "valid email address", then everything is fine...

lizardbill,
@lizardbill@hachyderm.io avatar

You can't parse [X]HTML with regex. https://stackoverflow.com/a/1732454/1288

NireBryce,
@NireBryce@hachyderm.io avatar

does there yet exist any application that can take a multi-selection and spit out a #regex that will match that in any file going forward?

necrosis, German
@necrosis@chaos.social avatar

Dear computer science teachers, please teach your students #regex.

It helps enormously on the job. 😅
I wish I had learned it back in school. 🥹

Linkshaender,
@Linkshaender@bildung.social avatar

@hobbypaedagoge let me jump in here 😉
Searching (and replacing) text
Parsing log files
Validating data/input (no, not email addresses!)
Cleaning up data
Extracting information from text files

Practical tip: learn awk.

Computer science class: automata theory, formal languages, parsing

Regexes are a sharp sword, and as with a sword, handling them should be practiced (see the XKCD cartoon)
@necrosis

mina,
@mina@berlin.social avatar

@Linkshaender

See me whilst presenting common command line tools to Windows users:

@hobbypaedagoge @necrosis

alter_unicorn, (edited )
@alter_unicorn@masto.bike avatar

Did you?

VoronoV,
@VoronoV@boitam.eu avatar

@alter_unicorn I don't know what this is about or what it means, but I answered YES so I could take part 😂

stux,
@stux@mstdn.social avatar

How to

Powerfromspace1,
@Powerfromspace1@mstdn.social avatar

@stux accurate 😉

Wen,
@Wen@mastodon.scot avatar

@stux @nrmacdonald I find it helps with my more interesting emacs commands.

rdela,
@rdela@mastodon.social avatar

Chat log from this morning’s with @zachleat + @mikeneu from @cloudcannon, in which I clumsily praise @paulcuth, @robb, and @bobmonsour among others!
https://gist.github.com/rdela/e8facf1a8a31ea5223c42075cbaa9bb2

Follow on Twitch
https://www.twitch.tv/cloudcannoncms

Subscribe on YouTube
https://www.youtube.com/@cloudcannon

Today's ep. https://youtu.be/Pt5CWtEPmBM

Bonus #regex I use to clean up the copy-pasted Discord chat…

(?# Space out copy-pasted discord chat)  
(?# find )  
^(.+)\n:\s?  
(?# replace )  
\n$1:\n  
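
Outside an editor's find-and-replace dialog, the same substitution might look like this in Python. It is a sketch only: the function name and the sample chat layout are assumptions, and Python writes the backreference as `\1` rather than `$1`.

```python
import re

def space_out_chat(text):
    # Find: a non-empty line followed by a line that starts with ":".
    # Replace: a blank line, the first line plus ":", then a newline.
    # re.MULTILINE makes ^ match at the start of every line.
    return re.sub(r"^(.+)\n:\s?", r"\n\1:\n", text, flags=re.MULTILINE)
```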
maxleibman,
@maxleibman@mastodon.social avatar

I’ve got an email-parsing project that will require some serious regular expressions.

It’s been a long while since I’ve written any regex. Can anybody recommend any good resources for putting off or avoiding doing it?

kyleejohnson,
@kyleejohnson@jawns.club avatar

@maxleibman @aronow this won’t help you procrastinate, but https://regexr.com is one of my favorite tools once I’m close to the expression I want. It lets you test expressions on text you put in, so you can tweak your expression and see the result changes instantly.

benzucker, German
@benzucker@maly.io avatar

Any wizards here?
Is there a way to match multiple linebreaks regardless of the content but only if the number of linebreaks exceeds a value like 5?

benzucker,
@benzucker@maly.io avatar

@barubary
Well there is most likely something nicer than this: \n.+\n.+\n.+\n

barubary,

@benzucker \n(.*\n){3}
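
A small Python sketch of the counted-group idea, set against the spelled-out `\n.+\n.+\n.+\n` from the post above. The helper `at_least_breaks` is hypothetical and reads "exceeds 5" as "6 or more line breaks".

```python
import re

# Same shape as "\n.+\n.+\n.+\n", written with a counted group.
# ".*" also accepts empty lines, where ".+" needs at least one character.
FOUR_BREAKS = re.compile(r"\n(?:.*\n){3}")

def at_least_breaks(n):
    # Hypothetical helper: n line breaks with any content in between.
    # The open-ended {n-1,} lets the match keep going past n.
    return re.compile(r"\n(?:.*\n){%d,}" % (n - 1))
```

Without `re.DOTALL`, `.` does not match a newline, which is what keeps each repetition to a single line.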

mgorny, Polish
@mgorny@pol.social avatar

The Python package #regex (not to be confused with the built-in re module) is built on CPython implementation details and does not properly support #PyPy (and the author says he may eventually block compilation on PyPy). However, it looks like the re-assert package that requires it works without problems on plain re.

Today #Gentoo is moving from imperfectly patching the regex package, and ignoring the special cases in which it will not work, to patching re-assert instead. I would like to send this trivial patch to the author, but, as I have complained before, I was banned at some point. The author cannot say why, but that does not stop him from considering the ban justified. Maybe he just proactively bans Linux distribution devs.

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8413cf2c2955533fdf212fea3970c99cf193d4a1
https://github.com/mrabarnett/mrab-regex/issues/521
https://github.com/mrabarnett/mrab-regex/issues/404

snacktraces,
@snacktraces@hachyderm.io avatar

Need a regex for a MAC address?

On a project a while back I needed one and created it. Thought I would share here with everyone.

https://snacktraces.com/blog/regex-for-mac-address.html

barubary,

@snacktraces does c+++ not support [[:xdigit:]]?
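
For comparison, a hedged Python sketch of a MAC address check. It is not the pattern from the linked blog post. Python's `re` module has no POSIX bracket classes such as `[[:xdigit:]]`, so the hex digits are spelled out; the names `HEX`, `MAC` and `is_mac` are illustrative.

```python
import re

HEX = r"[0-9A-Fa-f]"  # stand-in for [[:xdigit:]]
# Six pairs of hex digits separated by ":" or "-".
MAC = re.compile(rf"{HEX}{{2}}(?:[:-]{HEX}{{2}}){{5}}")

def is_mac(text):
    return MAC.fullmatch(text) is not None
```

This version accepts mixed separators such as `00:1A-2B:3C-4D:5E`; whether that is acceptable depends on the input you expect.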

jpaskaruk,
@jpaskaruk@growers.social avatar

What else is as incredibly impressive, and at the same time as horrifically ugly, as #RegularExpressions?

By the way, if you need to detect #chords in a text chord chart, here is the #regex you need:

"A-G?(maj|min|m|M|+|-|dim|aug)?[0-9|11|13](sus)?[0-9|11|13](add)?[0-9|11|13]*(/A-G?)?"

edit: any #Musicians out there, can you think of any edge-case chords I should test/adjust to catch? This will be part of a Free chord chart organizer, hit me with your worst.

#Programming

jpaskaruk,
@jpaskaruk@growers.social avatar

@barubary

I reached that with some modifications to something I found somewhere, and the impression I've got is that you can use | as a logical OR within a set like that.

I could be wrong there, I'll check up on that. But in the meantime, a real musical chord that fails to properly match against the regex will be more useful to me.

For the moment I'm considering it a solved problem, cause now I need to put on my javascript wetsuit and implement this into a web ui...

https://github.com/dnotes/markdown-it-chords

barubary,

@jpaskaruk | only means OR outside of a set like that. Within a [ ] set, every character is already OR'd (e.g. [abc] matches a or b or c, and [a|b] matches a or | or b).
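
The difference is easy to check in Python. The chord pattern below is a simplified, hypothetical rewrite that moves the alternation into groups; it is not jpaskaruk's original expression and it does not cover altered chords such as Em7b5.

```python
import re

# Inside [ ], "|" is a literal character, not alternation.
assert re.fullmatch(r"[a|b]", "|")
assert re.fullmatch(r"[0-9|11|13]", "11") is None  # a set matches one character

# Alternation belongs in a group. Hypothetical, simplified chord pattern:
CHORD = re.compile(r"""(?x)
    [A-G][#b]?                      # root note with optional accidental
    (?:maj|min|dim|aug|m|M|\+|-)?   # quality
    (?:11|13|[0-9])?                # extension such as 7, 9, 11, 13
    (?:sus[0-9]?)?                  # suspended
    (?:add(?:11|13|[0-9]))?         # added tone
    (?:/[A-G][#b]?)?                # slash bass note
""")

def is_chord(text):
    return CHORD.fullmatch(text) is not None
```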

youronlyone,
@youronlyone@c.im avatar

To my fellow who are also into programming. Can you handle / ?

Up to how much complexity?

When I was younger, it was easy. Today, I have to use a test tool! ^_^;;

@autistics @actuallyautistic

nis,

@youronlyone @autistics @actuallyautistic
Anything other than '.' and parentheses, I have to google.

youronlyone,
@youronlyone@c.im avatar

@RoundSparrow I understand that, I don't like memorisation because I'm bad with it. I'm more of, the more I use it, the more I'll remember it, not because I memorised it.

Another thing (although off-topic), the castle memory technique, usually attributed to memorisation. But, I don't know, for me, it's a storage technique. If I don't pull out a memory from its storage, I won't even remember it.

@autistics @actuallyautistic

danrot,
@danrot@mastodon.social avatar

Generally I like #regex, but there are two huge problems for me with it:

1️⃣ I don't need it often enough, making it hard to remember more complex stuff.
2️⃣ As if 1. would not be bad enough, every tool and language uses a different dialect of it 😩

danrot,
@danrot@mastodon.social avatar

@dantleech Are you talking about vimgrep? Not even using that, configured to use rg instead. But rg does not support lookbehinds, which I had to use today 🙈 At least not unless you set another flag 😕

dantleech, (edited )
@dantleech@fosstodon.org avatar

@danrot no just standard :s/foo/bar/g :)

stealthmusic,
@stealthmusic@mastodon.online avatar

always succeeds in surprising me. It has a built-in tester that allows testing and changing the expression directly in your code, dealing with all the nasty escaping that is required in Java. It even highlights matching groups. 💚

hennell,

This is pretty neat. A better way to in ? 🤔

https://github.com/gherkins/regexpbuilderphp

hennell,

@kboyd @emd

"It's fair to argue that this is one of those places. But it's also fair to argue that this is not one of those places."

I laughed at this, but totally agree with it.
It's very much a 'weigh up the pros and cons' for your use case situation.

kboyd,
@kboyd@phpc.social avatar

@hennell @emd It's a similar scenario to the "Qed" BCMath wrapper library I started writing this week, although as a mere chainable interface (and not a fluent DSL interface) mine is a fair bit less complex.

There are times when it might help, and times when it would not provide a benefit.

https://github.com/beryllium/Qed

No users yet, mind you. And a library with no users may be of little value ... until that first user needs it and reaches for it.

jas_hughes,

Day 1 of

Like others, found this one much harder than other Day 1 puzzles: especially with that tricky edge case that wasn't in the examples.

I try to stick with base in my solutions, and it wasn't so elegant for extracting strings.

jas_hughes,

Day 2 of

I really like when I can re-use the same code for parts 1 and 2, passing different functions as arguments to differentiate the solutions, so I was satisfied with this one.

jas_hughes,

Day 9 of

Speedy part 1, especially for a recursive approach for me (not something I do often).

But then I spent way too long trying to implement a reverse recursion function before realizing I could just reverse the array (screaming).

hamatti,
@hamatti@mastodon.world avatar

⭐️⭐️

First day of in the bag with two stars!

Today I used with my solution:

https://github.com/Hamatti/adventofcode-2023/blob/main/src/day_1.ipynb

paulox,
@paulox@fosstodon.org avatar

@hamatti I've just read your notebook. Great work. I appreciated your explanation of the solution and your mental process in solving the puzzle. I have to admit that I solved day 1 in a very similar way
https://github.com/pauloxnet/adventofcode/blob/main/aoc2023/day01.py

sabret00the,
@sabret00the@mas.to avatar

If I have a string and want to match all characters between the 10th character and the 48th character, what is the proper #regex for that? [A-Z0-9]{10,48} doesn't work 😭

sabret00the,
@sabret00the@mas.to avatar

@barubary I was renaming some music files. But they were named as "0X - Artist Name - Album Name - Title.mp3" and the easiest way to rename them in a batch was via Solid Explorer using the REGEX function.

barubary,

@sabret00the Ah, I see.
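
On the original question: `[A-Z0-9]{10,48}` asks for a run of 10 to 48 such characters, not for positions 10 to 48. One way to express positions, sketched in Python (the names `SLICE` and `chars_10_to_48` are illustrative, and the regex dialect in Solid Explorer may differ):

```python
import re

# Skip the first 9 characters, then capture up to 39 more:
# characters 10 through 48 inclusive (48 - 10 + 1 = 39).
SLICE = re.compile(r"^.{9}(.{1,39})")

def chars_10_to_48(s):
    m = SLICE.match(s)
    return m.group(1) if m else None
```

Note that `.` does not match a newline here, which should not matter for file names.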

vwbusguy,
@vwbusguy@mastodon.online avatar

Solved a *problem with #regex with more regex today.

*Not actually a problem with the regex itself, but one of unclear business requirements, but for anyone that said I'd regret the DNS regex I wrote a month later, I ate that soup today and it honestly wasn't bad.

linux_mclinuxface,
@linux_mclinuxface@fosstodon.org avatar

@vwbusguy and how many problems do you have now? Say it with me … that’s right: 2 problems.

vwbusguy,
@vwbusguy@mastodon.online avatar

@linux_mclinuxface No, I have \d+? problems.

themeowcate, French
@themeowcate@piaille.fr avatar

My manager: "I need to understand. I sent you that big file of messy, regurgitated data and you delivered a nice clean CSV, sorted and filtered. Could you pass me the script you used to do that?"

Me: "Oh, but I don't have a script."

Him: "But how did you do it?"

Me, all proud: "It's the power of REGEX!"

I love regexes. Regexes solve everything! Hey, I know, I'll write an HTML parser in regex!

adarr_volte,
@adarr_volte@mamot.fr avatar

@themeowcate maybe one day I'll get there; in the meantime they drive me up the wall...
Hats off if you've mastered them.

themeowcate,
@themeowcate@piaille.fr avatar

@pasqualeberesti
"You're a wizard, Pasquale"

vwbusguy,
@vwbusguy@mastodon.online avatar

The thing about coding with #regex is that it feels like I'm getting paid to do Sudoku puzzles for a living.

Tip for those who are asked to review code with regex: rather than scrutinizing the regex itself, ask to see the automated tests it is run against and look for gaps in them. Only get lost in the weeds of the regex if there's an obvious, significant performance problem.
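
A table of cases is one shape such tests can take. The sketch below is an assumption for illustration: `LABEL` is a hypothetical pattern for a single DNS hostname label (not the DNS regex mentioned in this thread), and `failures` is a made-up helper.

```python
import re

# Hypothetical pattern under review: one DNS hostname label.
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")

# Each case pairs an input with the expected verdict, so a reviewer
# can scan the table for missing variants.
CASES = [
    ("example", True),
    ("a", True),
    ("my-host", True),
    ("-leading", False),
    ("trailing-", False),
    ("", False),
    ("a" * 63, True),
    ("a" * 64, False),   # labels are limited to 63 characters
]

def failures(pattern, cases):
    """Return the cases whose actual result differs from the expected one."""
    return [(text, want) for text, want in cases
            if (pattern.fullmatch(text) is not None) != want]
```

An empty list from `failures(LABEL, CASES)` means every listed case behaves as expected; the review question is then which cases are missing from the table.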

vwbusguy,
@vwbusguy@mastodon.online avatar

@sudoedit This is part of why named groups are useful. You're just chaining starts and ends until you reach the eol or eof.
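
A rough Python illustration of named groups chained up to the end of the line. The log-line layout and the names `LOG_LINE` and `parse_line` are assumptions made for the example.

```python
import re

# Each named group ends where the next one starts; the last runs to EOL.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)

def parse_line(line):
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None
```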

barubary,

@vwbusguy My advice is essentially the opposite. Focus on the #regex, at least to get started. Regexes are code. Just like any other programming language, you have to learn the syntax and practice a bit, but the same principles apply as with program code in general.

When reviewing code, start by reading it. If there's something unclear, ask about it. Don't accept a regex consisting of 100 characters in one line without a single space. Compared to most other languages, regex syntax is terse: Few (if any) keywords, lots of symbols. Divide complex regexes into simple parts that are assembled into bigger constructs. You probably wouldn't accept a patch that adds hundreds of lines of unfactored code that has complex logic and nested loops, but no indentation or whitespace and no functions, so why write your regexes this way?

If your language builds regexes from strings, use string concatenation, formatting/indentation, comments, and named variables to make the structure of the pattern clear. If your language has the /x modifier, use it to allow sensible formatting and comments right in the regex (remember to escape with `\` or [ ] any spaces that should match literally). If your language supports (?(DEFINE)...) and the (?&foo) syntax for named "regex subroutines", consider using it (but also consider restructuring your code: it might be trying to do too much in a single regex).
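
As a sketch of the string-building approach in Python, whose `re` module supports verbose mode but not `(?(DEFINE)...)`: the building blocks `OCTET`, `IPV4` and `PORT` and the host:port format are assumptions chosen for the example.

```python
import re

# Named building blocks, assembled by string formatting.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"   # 0 to 255
IPV4 = rf"{OCTET}(?:\.{OCTET}){{3}}"             # four octets joined by dots
PORT = r"\d{1,5}"

ENDPOINT = re.compile(rf"""(?x)
    (?P<host> {IPV4} )   # dotted-quad address
    :
    (?P<port> {PORT} )   # numeric port, range not checked here
""")
```

The compiled pattern is long and unpleasant to read, but nobody has to read it: the review happens on the named parts.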

Once you understand the structure of the regex and how it is meant to work, it becomes much easier to review the tests: Are there any? Do they cover every input variant, exercising all parts of the regex, both matching and failing? (Failing matches are also relevant for finding performance issues: If a regex finds a match, it usually does so quickly. But a regex with exponential backtracking can take forever to fail because it'll try a huge number of variations before giving up on a string that doesn't match.)
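
The parenthetical about failing matches can be tried out with a classic nested-quantifier pattern. This is a small Python sketch; the names are illustrative and `n` should be kept small, because the time taken by the nested version grows very quickly with each extra character.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")   # nested quantifiers: many ways to split a run of "a"
FLAT = re.compile(r"a+$")        # accepts the same strings without nesting

def time_failed_search(pattern, n):
    subject = "a" * n + "b"      # the trailing "b" guarantees failure
    start = time.perf_counter()
    result = pattern.search(subject)
    return result, time.perf_counter() - start
```

Calling `time_failed_search(NESTED, n)` and `time_failed_search(FLAT, n)` for a few growing values of `n` shows both returning `None`, with the nested pattern taking noticeably longer each time.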

There is an infamous regex for RFC 822 email addresses out there on the internet[1]. It is thousands of characters long and utterly incomprehensible. However, it was not written manually: It is essentially "object code", assembled by commented code using string concatenation from named variables that follow the structure of the BNF grammar in the RFC. Strive for the latter, not the former.

[1] http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

villares,
@villares@ciberlandia.pt avatar

"I hate , but I think this worked fine. I used , a helper to find and replace stuff on multiple files, for those [of us] less well versed with the traditional CLI regex workflow."

Any other tips for user friendly find-and-replace tools?

https://github.com/py5coding/py5generator/issues/350#issuecomment-1752025818
