Hacker Read

shmerl · 2013-04-17 18:33:13+00:00

It's not the same as Ctrl+R: it doesn't look for a match in the middle of the string, only from the beginning of it. So while it's better at going back and forth between the matches, it matches less data to begin with.

flukus | karma 10283 | avg karma 1.36 · | 2016-07-20 23:51:39+00:00

It still prioritizes exact string matches I think.

maweki | karma 2051 | avg karma 3.8 · | 2017-04-24 23:34:41+00:00

I wonder whether it would help to match from both sides (start and end) simultaneously, since you know you're not looking in the middle of the string. You also don't care about capture groups.

est | karma 8357 | avg karma 2.2 · | 2019-10-15 01:46:31+00:00

> Sometimes you just check for a match

re.match vs re.search

hvdijk · 2021-04-23 06:49:26+00:00

It solves a slightly different problem, searching for one substring instead of searching for one of many, so it would not be fair to compare the two.

Liquid_Fire | karma 730 | avg karma 1.95 · | 2019-10-15 11:23:41+00:00

re.search is different. re.match matches from the beginning of the string, while re.search matches from anywhere in the string.
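A minimal Python sketch of that difference, using only the standard `re` module. The sample string is made up for illustration:

```python
import re

text = "say hello"  # hypothetical sample input

# re.match only succeeds if the pattern matches starting at position 0.
anchored = re.match(r"hello", text)

# re.search scans forward and succeeds at the first position that matches.
anywhere = re.search(r"hello", text)

print(anchored)         # None
print(anywhere.span())  # (4, 9)
```

If you only need a yes/no answer, `is not None` on either result is enough; the choice between the two is about where the match is allowed to start.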

bithub | karma 19 | avg karma 1.9 · | 2016-05-31 13:40:30

Shorter yes, but reversing the whole string and comparing it with the input value is actually slower/less efficient than the algorithm shown in the article. Sorry for my nitpicking ;)
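To make the comparison concrete, here is a rough Python sketch. It assumes the discussion is about a palindrome check and that the article's algorithm compares characters from both ends; neither is confirmed by this thread, and both function names are hypothetical:

```python
def is_palindrome_reversed(s: str) -> bool:
    # The "shorter" version: builds a full reversed copy, then compares.
    return s == s[::-1]


def is_palindrome_two_pointer(s: str) -> bool:
    # Assumed stand-in for the article's algorithm: compare from both ends
    # and stop at the first mismatch, without allocating a copy.
    i, j = 0, len(s) - 1
    while i < j:
        if s[i] != s[j]:
            return False
        i += 1
        j -= 1
    return True
```

The second version does at most half as many character comparisons and no allocation. Which one is actually faster in practice depends on the language and runtime, so measure before relying on either.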

ygra | karma 5786 | avg karma 1.96 · | 2023-02-07 05:20:23

The difference may be regex matching. This can often be optimized to an impressive degree, depending on the regex, but unless it's a simple substring search without any metacharacters, I'm not sure those approaches are comparable.

burntsushi | karma 13683 | avg karma 4.52 · | 2016-08-24 13:18:56

Pretty much, yes. Longer strings should also match fewer times than short strings, which should also speed things up (because reporting a match has its own overhead associated with it, like printing text to the terminal).

minikomi | karma 3543 | avg karma 2.89 · | 2015-05-27 09:59:00+00:00

Ah, too late to edit, but using for*/first to terminate the first time it finds a match is faster still.

mdaniel | karma 5981 | avg karma 1.56 · | 2018-01-21 20:37:23+00:00

As a for-your-consideration, C-r leaves the cursor at the matched string when doing reverse search; so "echo helol world", enter, mutter "rats", C-r, o-l-sp, right (just to break the search), and voila you are now positioned on the offending substring

curi | karma 535 | avg karma 0.88 · | 2008-01-25 05:17:30+00:00

So you build his graph thing at the end. And if you go partway down and then dead-end (no match), don't you have to go back to the 2nd char of the non-match and try to match a word from there, and thus do a lot more comparisons than the number of bytes?

The reason is that when you put 100+ words in the tree, they'll share some substrings.

adestefan | karma 3297 | avg karma 2.72 · | 2015-06-21 13:57:09+00:00

If you're doing a substring match, then use a substring function.

burntsushi | karma 13683 | avg karma 4.52 · | 2017-05-26 15:59:28+00:00

Eh? PCMPISTRI has a few different modes of operation, including full substring search and character classes. You can use PCMPISTRI on a needle that contains adjacent classes: for example, `azAZ09` would check if any byte in the search string is in any of the ranges a-z, A-Z and 0-9.

Regardless, in the OP, they're specifically looking for one of a small number of bytes, which is exactly what PCMPISTRI is supposed to be good for.

With that said, my experience mirrors glangdale's. Every time I've tried to use PCMPISTRI, it's either been slower than other methods or not enough of an improvement to justify it.
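For readers unfamiliar with the ranges mode, this Python sketch models only the semantics described above: the needle is read as pairs of bytes, each pair an inclusive range. It is not the instruction itself (which works on 16-byte registers) and says nothing about its performance; the function name is hypothetical:

```python
def first_byte_in_ranges(needle: bytes, haystack: bytes) -> int:
    """Return the index of the first haystack byte that falls inside any
    inclusive (low, high) range from the needle, or -1 if none does."""
    if len(needle) % 2 != 0:
        raise ValueError("needle must consist of (low, high) byte pairs")
    # b"azAZ09" becomes [(a, z), (A, Z), (0, 9)] as integer byte values.
    ranges = [(needle[i], needle[i + 1]) for i in range(0, len(needle), 2)]
    for index, byte in enumerate(haystack):
        if any(low <= byte <= high for low, high in ranges):
            return index
    return -1


print(first_byte_in_ranges(b"azAZ09", b"--- x"))  # 4
```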

ghusbands | karma 2989 | avg karma 3.52 · | 2020-10-13 19:55:15+00:00

> pre-processing that makes sense only if you know you are going to search for the same pattern repeatedly

The pre-processing is typically worth it if you're searching through a large string, even just once.

LK5ZJwMwgBbHuVI | karma 23 | avg karma 1.15 · | 2024-03-20 17:46:55

Problem is, plenty of software doesn't actually look at the match but rather just validates that there was a match (and then continues to use the input to that match).
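A small Python sketch of the two habits, with a made-up pattern and input (both function names are hypothetical):

```python
import re

DIGITS = re.compile(r"\d+")  # hypothetical validation pattern


def validate_only(value: str) -> str:
    # Only checks that *something* matched, then keeps using the raw input.
    if DIGITS.search(value) is None:
        raise ValueError("no digits found")
    return value


def use_match(value: str) -> str:
    # Uses the matched text itself, so nothing outside the match survives.
    m = DIGITS.search(value)
    if m is None:
        raise ValueError("no digits found")
    return m.group(0)


print(validate_only("42; rm -rf"))  # 42; rm -rf
print(use_match("42; rm -rf"))      # 42
```

With the validate-only style, the pattern has to be anchored to the whole input (for example with `re.fullmatch`) for the check to say anything about the parts that did not match.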

underyx | karma 2129 | avg karma 5.26 · | 2016-08-24 18:16:48+00:00

Wait, I always tried to make my pattern as short as possible and I thought it would speed up the searches. So I guess this means I'm actually better off searching for the longest possible match then?

marcosdumay | karma 27273 | avg karma 1.67 · | 2019-03-20 16:59:18

Matches should monotonically disappear and not get reordered as you enter more letters.

This is trivial to do in exact-match contexts (like start menus; why does no one get this?). It's OK to wait until the user enters some letters to start showing options, and it's also OK to limit the number of results as long as you say there are more somewhere. What is not OK is to show a single match, and then, after the user presses another key, show two matches with the first one gone.

Distance based matching can't strictly follow this rule, but if you are optimizing one, it is a good goal to get approximately right.
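As a sketch of the exact-match case, a plain case-insensitive substring filter over a fixed, sorted list has this property by construction: every item that contains the longer query also contains the shorter one. The menu entries and function name below are made up:

```python
def substring_matches(items, query, limit=5):
    """Return (shown, has_more): up to `limit` items containing `query`,
    in a fixed sorted order, plus a flag saying more results exist."""
    q = query.lower()
    hits = sorted(name for name in items if q in name.lower())
    return hits[:limit], len(hits) > limit


menu = ["Terminal", "Text Editor", "Settings"]  # hypothetical entries

print(substring_matches(menu, "te"))   # (['Terminal', 'Text Editor'], False)
print(substring_matches(menu, "ter"))  # (['Terminal'], False)
```

Because the order is fixed and the hit set can only shrink as the query grows, results disappear but never get reordered. A previously hidden item can still move into view once the list drops below the limit, which is why the `has_more` flag is there.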

Rynant | karma 23 | avg karma 1.15 · | 2014-06-04 14:03:53

This is not the same as finding the last match though. The parent's example will match '2' in '1 of 2 steps.'

layer8 | karma 23301 | avg karma 2.59 · | 2024-03-29 02:18:52

So it has to be two thousand identical strings? Then I don’t understand the benefit over search & replace.