Details

Oniguruma provides the same functionality as Ruby's standard regexp, with almost the same interface. The main missing feature is the slash literal syntax for declaring regular expressions (/regexp/). On the other hand, Oniguruma provides additional functionality:

  • Extensive multibyte support.
    reg = ORegexp.new( 'е(хал)', 'i', 'utf8' )
    matches = reg.match("Text: Ехал Грека Через Реку")
    puts matches[0]                    #=> "Ехал"
    
  • Named groups.
    reg = ORegexp.new( '(?<before>.*)(a)(?<after>.*)' )
    match = reg.match( 'terraforming' )
    puts match[0]                      #=> 'terraforming'
    puts match[:before]                #=> 'terr'
    puts match[:after]                 #=> 'forming'
    
  • Named backreferences.
    re = ORegexp.new('(?<pre>\w+?)\d+(?<after>\w+)')
    puts re.sub('abc123def', ' \<after>123\<pre> ')   #=> " def123abc "
    
  • Positive and negative lookbehind.
    re = ORegexp.new('(?<!g)ong')
    m1 = re.match("song")
    puts m1[0]                         #=> "ong"
    m2 = re.match("gong")              #=> nil
    
  • Support for different regexp syntaxes (Java, Perl, etc.).
    re = ORegexp.new( 'section{([^}]*)}', :syntax => SYNTAX_PERL )
    re.sub('section{First}', 'subsection{\1}')        #=> "subsection{First}"
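
The lookbehind item above shows only the negative form. The sketch below adds the positive form next to it, and repeats the named-group swap. It assumes Ruby 1.9 or later and uses the built-in Regexp class (whose engine derives from Oniguruma) rather than ORegexp, so it runs without the gem. The helper names matched_text and swap_around_digits are hypothetical and not part of any library. Note that String#sub writes named backreferences as \k<name>, which differs from the \<name> form shown for ORegexp#sub.

```ruby
# Assumption: Ruby 1.9+, using the built-in Regexp class, not ORegexp.

# Hypothetical helper: the text matched by a pattern, or nil if nothing matches.
def matched_text(pattern, string)
  m = Regexp.new(pattern).match(string)
  m && m[0]
end

# Positive lookbehind: "ong" matches only when preceded by "g".
p matched_text('(?<=g)ong', 'gong')   #=> "ong"
p matched_text('(?<=g)ong', 'song')   #=> nil

# Negative lookbehind: "ong" matches only when NOT preceded by "g".
p matched_text('(?<!g)ong', 'song')   #=> "ong"
p matched_text('(?<!g)ong', 'gong')   #=> nil

# Hypothetical helper: swap the words around a run of digits using named
# groups; the built-in String#sub refers to them as \k<name>.
def swap_around_digits(string)
  string.sub(/(?<pre>\w+?)\d+(?<after>\w+)/, ' \k<after>123\k<pre> ')
end

p swap_around_digits('abc123def')     #=> " def123abc "
```

Passing the pattern as a string keeps the calls close to the ORegexp.new( '...' ) style used in the list, at the cost of compiling the pattern on every call.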