stevensanderson, to programming
@stevensanderson@mstdn.social

If you work with text data in R, the gregexpr() function is essential for pattern matching. It finds all occurrences of a pattern within a string. Key parameters include pattern, text, ignore.case, perl, fixed, and useBytes. You can match characters, ignore case, use advanced regex, and search fixed strings.

#R

Post: https://www.spsanderson.com/steveondata/posts/2024-05-17/

stevensanderson, to programming
@stevensanderson@mstdn.social

🎉 New Post Alert! 🎉

Counting words in a string is a fundamental task in data analysis.

  1. Base R: Use strsplit(), a straightforward method to split strings and count words.

  2. stringr: The str_split() function from the stringr package makes the code more readable.

  3. stringi: For powerful and efficient string manipulation, stri_split_regex() from the stringi package is your go-to.

Happy coding! 🚀

#R

Post: https://www.spsanderson.com/steveondata/posts/2024-05-16/

stevensanderson, to programming
@stevensanderson@mstdn.social

🔎 Selecting Columns Containing a Specific String in R: A Quick Guide 🚀

Hey R users! Need to select columns by a specific string? Here's how in base R, stringr, stringi, dplyr, and with a bonus from data.table.

🆒 R
✅ grepl
📦 stringr
📦 stringi
📦 dplyr

Bonus: 📦 data.table
library(data.table)
df_price <- df[, names(df) %like% "price"]

Happy coding! 🚀

Post: https://www.spsanderson.com/steveondata/posts/2024-05-15/

#R #RProgramming #Programming #RStats #Coding #RegularExpressions #RegEx #stringr #stringi #dplyr #datatable #baseR

lizardbill, to RegEx
@lizardbill@hachyderm.io

You can't parse [X]HTML with regex. https://stackoverflow.com/a/1732454/1288

stevensanderson, to programming
@stevensanderson@mstdn.social

🔍 Quick Guide: Detecting Strings in R

In my latest blog post, I cover how to find specific strings in data columns using the str_detect function from the stringr package and base R functions. You'll see practical examples with both grepl for identifying matches and gregexpr for counting occurrences.

Read more here: https://www.spsanderson.com/steveondata/posts/2024-05-10/ and explore ways to make string detection a breeze in your data work!

#RStats #DataCleaning #R #RProgramming #Programming #Data #Regex

donwatkins, to RegEx
@donwatkins@fosstodon.org

Regular Expressions #2: An example – Both.org https://www.both.org/?p=5154

stevensanderson, to stackoverflow
@stevensanderson@mstdn.social

I was working on a problem today where I needed to pick out a department and a sub issue that were attached to each other as a single code in a comment.

I first went to use the traditional SUBSTRING(comment_string, 1, 5) IN (my list of codes), but it was slow.

So off to work I went, and I learned something new: how to make a sargable LIKE, with whatever I want in the narrower results.

Nice to learn something new.

profoundlynerdy, to raku
@profoundlynerdy@bitbang.social

What are some underappreciated superpowers that and/or has EXCLUDING and ?

NireBryce, to RegEx
@NireBryce@hachyderm.io

Does there yet exist any application that can take a multi-selection and spit out a regex that will match that in any file going forward?

stevensanderson, to RegEx
@stevensanderson@mstdn.social

I decided to make a blog post out of a problem I worked on a day or two ago and thankfully I was also pointed to another solution from @embiggenData which worked well too.

#R

Post: https://www.spsanderson.com/steveondata/posts/2024-04-12/

necrosis, to RegEx German
@necrosis@chaos.social

Dear computer science teachers, please teach your students regex.

It helps enormously on the job. 😅
I wish I had already learned it in school. 🥹

stux, to RegEx
@stux@mstdn.social

How to #REGEX

rdela, to RegEx
@rdela@mastodon.social

Chat log from this morning’s with @zachleat + @mikeneu from @cloudcannon, in which I clumsily praise @paulcuth, @robb, and @bobmonsour among others!
https://gist.github.com/rdela/e8facf1a8a31ea5223c42075cbaa9bb2

Follow on Twitch
https://www.twitch.tv/cloudcannoncms

Subscribe on YouTube
https://www.youtube.com/@cloudcannon

Today's ep. https://youtu.be/Pt5CWtEPmBM

Bonus regex I use to clean up the copy-pasted Discord chat…

(?# Space out copy-pasted discord chat)  
(?# find )  
^(.+)\n:\s?  
(?# replace )  
\n$1:\n  
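
For anyone who wants to try the find/replace pair above outside an editor, here is a minimal sketch in Python. It assumes multiline mode (so `^` matches at every line start) and a made-up sample in which each name sits on its own line and the message follows on a line starting with `:`; the function name `space_out_chat` and the sample are hypothetical, not from the post.

```python
import re

# Pattern and replacement taken from the post. MULTILINE is an assumption so
# that ^ matches at the start of every line, not only at the start of the text.
CHAT_PATTERN = re.compile(r"^(.+)\n:\s?", re.MULTILINE)


def space_out_chat(text):
    """Rewrite 'name<newline>: message' as '<newline>name:<newline>message'."""
    # Python spells the backreference \1 where the post's editor uses $1.
    return CHAT_PATTERN.sub(r"\n\1:\n", text)


# Hypothetical pasted chat, not real Discord output.
sample = "alice\n: hello there\nbob\n: hi\n"
print(space_out_chat(sample))
```

Each speaker ends up on a `name:` line with a blank line in front of it, which is what the replacement string asks for.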
danyeaw, to python
@danyeaw@fosstodon.org

I'm excited about giving a talk about Matching Text with Regular Expression at Michigan Python tomorrow at 7pm EST at Washtenaw Community College in Ann Arbor and online. All are welcome! https://www.meetup.com/michigan-python/events/299577684/

Mehrad, to RegEx
@Mehrad@fosstodon.org

I really do enjoy #regex. It always cheers me up. Kinda feel like a cool puzzle.

I learnt regex when I was learning Perl back in the day. But big shout-out to https://www.regular-expressions.info/ and https://regexr.com/ for providing such good resources for me to help my friends and colleagues also learn regex and enjoy writing it.

neustradamus, to random
@neustradamus@mastodon.social
mjgardner,
@mjgardner@social.sdf.org

@neustradamus PCRE continues to be a misnomer; it’s a modified subset of Perl regex with dozens of differences: https://pcre.org/current/doc/html/pcre2compat.html

It's not "(C)ompatible." Accept no substitutes: https://perldoc.perl.org/perlre

maxleibman, to RegEx
@maxleibman@mastodon.social

I’ve got an email-parsing project that will require some serious regular expressions.

It’s been a long while since I’ve written any regex. Can anybody recommend any good resources for putting off or avoiding doing it?

benzucker, to RegEx German
@benzucker@maly.io

Any regex wizards here?
Is there a way to match multiple linebreaks regardless of the content but only if the number of linebreaks exceeds a value like 5?
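
One possible reading of the question is "match a run of consecutive line breaks only when the run is longer than 5". Under that assumption, a bounded quantifier such as `{6,}` does the job. The sketch below is in Python; the helper names `long_break_pattern` and `collapse_long_breaks`, and the choice to collapse long runs to a single blank line, are illustrative assumptions rather than anything from the post.

```python
import re


def long_break_pattern(limit=5):
    """Compile a pattern matching runs of MORE than `limit` line breaks."""
    # (?:\r?\n) accepts both Unix and Windows line endings;
    # {limit+1,} means "at least limit + 1 repetitions".
    return re.compile(r"(?:\r?\n){%d,}" % (limit + 1))


def collapse_long_breaks(text, limit=5):
    """Replace each over-long run of line breaks with one blank line."""
    return long_break_pattern(limit).sub("\n\n", text)


# Five breaks are left alone; six are collapsed.
sample = "a" + "\n" * 5 + "b" + "\n" * 6 + "c"
print(repr(collapse_long_breaks(sample)))
```

If the line breaks may be separated by other content, this pattern does not apply as written; counting breaks across arbitrary text needs a different approach.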

castarco, to til
@castarco@hachyderm.io

Today I learnt that adding ? after * transforms a regex expression from being "greedy" into "lazy" (important for performance, safe validators, and protection against DoS attacks).

I don't know how I missed this bit of knowledge for so long. :blobfoxbox:
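
A small sketch of the greedy/lazy difference, using Python's `re` module. The HTML string is a made-up sample, and the sketch only shows what each quantifier captures; it does not measure step counts or performance.

```python
import re

# Hypothetical input with two script blocks.
html = "<script>a()</script><p>x</p><script>b()</script>"

# Greedy: .* grabs as much as it can, then backtracks to the LAST </script>.
greedy = re.findall(r"<script>(.*)</script>", html, flags=re.DOTALL)

# Lazy: .*? grabs as little as it can, stopping at the FIRST </script>.
lazy = re.findall(r"<script>(.*?)</script>", html, flags=re.DOTALL)

print(greedy)  # one capture spanning both blocks
print(lazy)    # one capture per block
```

With the greedy form the single capture swallows everything between the first opening tag and the last closing tag; the lazy form yields each block separately.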

castarco,
@castarco@hachyderm.io

@barubary

Sure. What follows is a dumb example (executed in https://regex101.com/), but it illustrates my point.

In this particular case you could say that ? is semantically required for <script> because we could have more than one, but many times we don't have this distinction and it still affects how many steps the regex engine has to perform.

(Sorry for having the text selected in the 2nd image, I was copying it for the alt of the images 😅 )

[Result: 1 match, 75 steps, 0.0ms. Regexp (with the ? symbol): /([\s\S]*?)<\/script>/gi. Text: <main> Hello World <script>console.log("hello!"); More stuff Just a decoy!](https://media.hachyderm.io/media_attachments/files/111/914/833/409/432/020/original/3925f50f868f8a82.png)
mgorny, to RegEx Polish
@mgorny@pol.social

The Python regex package (not to be confused with the built-in re module) is built on CPython implementation details and does not properly support PyPy (and the author has announced that he may eventually block compilation on PyPy). However, it looks like re-assert, the package that requires it, works without problems with plain re.

Today Gentoo is moving from imperfectly patching the regex package, and ignoring the corner cases in which it won't work, to patching re-assert instead. I would like to send this trivial patch to the author but, as I have complained before, I was banned at some point; the author can't say why, but that doesn't stop him from considering the ban fair. Maybe he simply proactively bans Linux distribution devs.

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8413cf2c2955533fdf212fea3970c99cf193d4a1
https://github.com/mrabarnett/mrab-regex/issues/521
https://github.com/mrabarnett/mrab-regex/issues/404
