timbray,
@timbray@cosocial.ca

Hey, a minor announcement: An IETF thing I helped with has been approved and will be getting an RFC number in a few weeks: https://www.ietf.org/archive/id/draft-ietf-jsonpath-iregexp-08.html

It's a small-ish subset of regular expression syntax/semantics that should work interoperably across the (many) popular regexp implementations.

Mostly for use in other specifications I think?

pepita,

deleted_by_author

  • timbray,
    @timbray@cosocial.ca

    @pepita My impression was that in practical terms those ranges are safe to use for the foreseeable future. \d and \s are actively pernicious.
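
One way \d can bite: its meaning is not the same in every engine or mode. The sketch below is only an illustration in Python, and it assumes the concern is Unicode digits versus ASCII digits; the thread itself does not spell out which ranges were being discussed. In Python 3, \d on a str pattern matches any Unicode decimal digit, while an explicit [0-9] range matches only the ten ASCII digits.

```python
import re

# Arabic-Indic digits three and four (U+0663, U+0664).
arabic_indic = "\u0663\u0664"

unicode_digits = re.compile(r"\d+")    # Unicode-aware by default for str patterns
ascii_digits = re.compile(r"[0-9]+")   # explicit range, ASCII digits only

print(bool(unicode_digits.fullmatch(arabic_indic)))  # True
print(bool(ascii_digits.fullmatch(arabic_indic)))    # False
print(bool(ascii_digits.fullmatch("34")))            # True
```

An engine that treats \d as ASCII-only would give a different answer for the first line, which is the kind of interoperability gap an explicit range avoids.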

    aegilops,

    @timbray Aside from the engines you mentioned, is it worth also referencing how this affects grep, grep -E and POSIX regex? Grep is still a mainstay!

    Java and C# are also major languages that are sometimes used in protocol processing relevant to the IETF.

    Can Java’s built-in regex conform easily?

    https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

    C# says it is Perl 5 compatible, so I’d expect so.

    timbray,
    @timbray@cosocial.ca

    @aegilops My impression is that the I-Regexp subset should Just Work for grep & friends. I believe Java is a PCRE subset so that should be fine too.

    aegilops,

    @timbray I wonder how close the match is to #Hyperscan. That’s Intel’s high-perf regex engine, which disallows backreferences, lookarounds and capture groups, much like I-Regexp.

    https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-hyperscan.html

    If you want more fodder for considering which regexes would work in your spec, they have a corpus of thousands available.

    Hyperscan does allow ., \s, \d and so on, so you’re even stricter!

    #regex

    nogweii,
    @nogweii@nogweii.net

    @timbray very cool! Love to see it. I didn't see any discussion on the greediness of the quantifiers. I guess the interoperable assumption is very greedy by default.

    underlap,
    @underlap@fosstodon.org

    @timbray Congratulations to you and Carsten! Let's see who uses this, apart from JSONPath...

    josephholsten,

    @timbray That’s a lovely ABNF! I don’t often get to see many more complex than RFC 2822 headers.

    spacecowboy,

    @timbray great initiative!

    robpike,
    @robpike@hachyderm.io

    @timbray No backreferences!! Yay. I hope all implementations that may arise later appreciate that it can be done in O(n) time always. Theory is your friend. Nice.

    timbray,
    @timbray@cosocial.ca

    @robpike Realistically, probably only of use for boolean does-it-match-or-not, but that's an interestingly big part of the problem space.

    robpike,
    @robpike@hachyderm.io

    @timbray I still think this is great. Thanks or congratulations or whatever is appropriate.

    modenaboy,

    @robpike @timbray I second the motion. This sounds like a very valuable approach, focusing on a common and core use case with an eye towards reliability and interoperability. Thanks for all the work that went into this!

    rlb,

    @timbray @robpike spelling "does it match or not" is easier outside the regexp language anyway, IMO. People try to cram too many things into the regexp matching and we end up with attempts to range-limit IPv4 octets textually.
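
As a sketch of the IPv4 point: let the regexp check only the shape of the string, and do the numeric range check in ordinary code. The pattern below is meant to stay within the kind of syntax I-Regexp allows (character ranges, counted quantifiers, a group, an escaped dot), though that is my reading and not something the thread confirms. The function name is hypothetical.

```python
import re

# Shape only: four dot-separated groups of one to three ASCII digits.
DOTTED_QUAD = re.compile(r"[0-9]{1,3}(\.[0-9]{1,3}){3}")


def is_ipv4(text):
    """Check the shape with a regexp, then the octet range with integers."""
    if DOTTED_QUAD.fullmatch(text) is None:
        return False
    return all(int(part) <= 255 for part in text.split("."))


print(is_ipv4("192.168.0.1"))  # True
print(is_ipv4("256.1.1.1"))    # False
print(is_ipv4("1.2.3"))        # False
```

This sketch does not reject leading zeros such as "01.2.3.4"; whether to allow them is a policy decision that is also easier to express outside the pattern.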
