Posted by Nick Gammon, Australia (Forum Administrator)
By my measurements, yours is not 2x faster. In some cases it is slightly faster, in others it is slower. Arguably, if it isn't providing the results you want (it leaves the trailing "and" in the first capture), then the speed doesn't matter. You could always remove the trailing "and" with string.sub, but that would take extra time. I made up a test bed:
require "re"
require "tprint"

c = re.compile [[
  parse     <- {| {noDelim} lastDelim |}    -- look for all up to the last delimiter followed by the last part
  delim     <- 'and'                        -- our delimiter
  noDelim   <- (!lastDelim .)*              -- zero or more characters without the last delimiter
  lastDelim <- delim {(!delim .)*} !.       -- the delimiter without any more delimiters and then end of subject
]]

pat = re.compile "{| {g <- . g / 'and'} {.*} |}"  -- Albert Chan pattern

function showResults (result, start, finish)
  if not result then
    print ("no match")
  else
    tprint (result)
  end -- if
  print (string.format ("Time taken = %0.3f us", (finish - start) * 1e6))
end -- showResults

function test (which)
  print (string.rep ("=", 20))
  print ("Testing:", which)

  print (string.rep ("-", 10))
  print "Nick"
  start = utils.timer ()
  result = lpeg.match (c, which)
  finish = utils.timer ()
  showResults (result, start, finish)

  print (string.rep ("-", 10))
  print "Albert"
  start = utils.timer ()
  result = lpeg.match (pat, which)
  finish = utils.timer ()
  showResults (result, start, finish)
end -- test

tests = {
  "foo and bar and whatever",
  "foo and bar",
  "XandY",
  "foo",
  "Xand",
  "andY",
  "and",
  "",
  }

for _, v in ipairs (tests) do
  test (v)
end -- for
You will notice that in the very case you were interested in (multiple instances of the word "and"), your expression is almost 4 times as slow (43.302 us compared to 11.733 us):
====================
Testing: foo and bar and whatever
----------
Nick
1="foo and bar "
2=" whatever"
Time taken = 11.733 us
----------
Albert
1="foo and bar and"
2=" whatever"
Time taken = 43.302 us
====================
Testing: foo and bar
----------
Nick
1="foo "
2=" bar"
Time taken = 5.029 us
----------
Albert
1="foo and"
2=" bar"
Time taken = 4.749 us
====================
Testing: XandY
----------
Nick
1="X"
2="Y"
Time taken = 4.749 us
----------
Albert
1="Xand"
2="Y"
Time taken = 4.749 us
====================
Testing: foo
----------
Nick
no match
Time taken = 3.073 us
----------
Albert
no match
Time taken = 2.794 us
====================
Testing: Xand
----------
Nick
1="X"
2=""
Time taken = 4.470 us
----------
Albert
1="Xand"
2=""
Time taken = 5.867 us
====================
Testing: andY
----------
Nick
1=""
2="Y"
Time taken = 4.749 us
----------
Albert
1="and"
2="Y"
Time taken = 4.470 us
====================
Testing: and
----------
Nick
1=""
2=""
Time taken = 5.029 us
----------
Albert
1="and"
2=""
Time taken = 4.470 us
====================
Testing:
----------
Nick
no match
Time taken = 3.073 us
----------
Albert
no match
Time taken = 3.073 us
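If you wanted to keep your pattern and strip the trailing "and" from its first capture afterwards, the string.sub approach mentioned at the top might look something like the sketch below. The function name stripTrailing is made up for illustration (it is not part of the test bed), and I have not timed it, so the cost of the extra call is not included in the figures above.

```lua
-- Hypothetical helper (not part of the test bed above):
-- remove a trailing delimiter from a captured string using string.sub.
function stripTrailing (s, delim)
  -- compare the tail of s with delim (a plain comparison, no patterns)
  if delim ~= "" and s:sub (-#delim) == delim then
    return s:sub (1, -#delim - 1)   -- everything before the delimiter
  end -- if
  return s  -- no trailing delimiter, return unchanged
end -- stripTrailing

-- first capture of the "Albert" pattern for "foo and bar and whatever"
print ("[" .. stripTrailing ("foo and bar and", "and") .. "]")  --> [foo and bar ]
```

With that applied, the first capture would be the same as the one from my grammar ("foo and bar "), at the price of an extra function call per match.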
I took the compile part out of the timing, because you should really only do that once, and the speed you are really interested in is execution speed (that is, match speed).
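Timing a single match, as the test bed does, gives figures of only a few microseconds, so they may jump around from run to run. One way to steady them, shown here only as a sketch, is to average over many calls, with the compile still done once outside the loop. The helper name averageMicroseconds is made up, and it uses the standard os.clock (CPU time) rather than the utils.timer call used in the test bed, so the numbers would not be directly comparable with the ones above. The string.find call is just a stand-in workload so the example runs in plain Lua.

```lua
-- Hypothetical helper (not part of the test bed above):
-- average the time taken by fn over n calls, in microseconds.
-- Assumption: os.clock (CPU time) is used in place of utils.timer.
function averageMicroseconds (fn, n)
  local start = os.clock ()
  for i = 1, n do
    fn ()
  end -- for
  return (os.clock () - start) / n * 1e6
end -- averageMicroseconds

-- Stand-in workload: a plain-text search for the delimiter.
-- In the test bed you would pass something like
--   function () return lpeg.match (c, subject) end
-- with c compiled once, before calling averageMicroseconds.
local subject = "foo and bar and whatever"
local us = averageMicroseconds (
  function () return string.find (subject, "and", 1, true) end,
  100000)

print (string.format ("Average time = %0.3f us", us))
```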
Having said all that, your pattern looks nice and elegant. :)
- Nick Gammon
www.gammon.com.au, www.mushclient.com