Feature or enhancement
Proposal:
Very similar to #148729. I noticed that for regexes of the form
re.compile("^foo", re.MULTILINE)
there's still a character-by-character loop calling SRE(match).
Replacing this with a memchr scan for \n in the single-byte character string case gives speedups of about 8x on an M4 MacBook.
A patch would look like this:
diff --git a/Modules/_sre/sre_lib.h b/Modules/_sre/sre_lib.h
index 4f1269988b9..11d341ea2ab 100644
--- a/Modules/_sre/sre_lib.h
+++ b/Modules/_sre/sre_lib.h
@@ -1869,12 +1869,42 @@ SRE(search)(SRE_STATE* state, SRE_CODE* pattern)
             state->start = state->ptr = ptr = end;
             return 0;
         }
-        while (status == 0 && ptr < end) {
-            ptr++;
-            RESET_CAPTURE_GROUP();
-            TRACE(("|%p|%p|SEARCH\n", pattern, ptr));
-            state->start = state->ptr = ptr;
-            status = SRE(match)(state, pattern, 0);
+        if (pattern[0] == SRE_OP_AT &&
+            pattern[1] == SRE_AT_BEGINNING_LINE)
+        {
+            /* Skip to line boundary */
+            while (status == 0 && ptr < end) {
+                ptr++;
+                if (!SRE_IS_LINEBREAK((int) ptr[-1])) {
+#if SIZEOF_SRE_CHAR == 1
+                    ptr = (SRE_CHAR *)memchr(ptr, '\n', end - ptr);
+                    if (!ptr) {
+                        break;
+                    }
+#else
+                    while (ptr < end && !SRE_IS_LINEBREAK((int) *ptr)) {
+                        ptr++;
+                    }
+                    if (ptr >= end) {
+                        break;
+                    }
+#endif
+                    ptr++;
+                }
+                RESET_CAPTURE_GROUP();
+                TRACE(("|%p|%p|SEARCH\n", pattern, ptr));
+                state->start = state->ptr = ptr;
+                status = SRE(match)(state, pattern, 0);
+            }
+        }
+        else {
+            while (status == 0 && ptr < end) {
+                ptr++;
+                RESET_CAPTURE_GROUP();
+                TRACE(("|%p|%p|SEARCH\n", pattern, ptr));
+                state->start = state->ptr = ptr;
+                status = SRE(match)(state, pattern, 0);
+            }
         }
     }
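For readers who don't want to trace the C, the scanning strategy can be illustrated in pure Python. This is only a sketch of the idea, not the proposed implementation: search_at_line_starts is a hypothetical helper, bytes.find stands in for memchr, and it assumes a pattern like ^foo compiled with re.MULTILINE, so that a match can only begin at offset 0 or just after a \n.

```python
import re


def search_at_line_starts(pattern, data, pos=0):
    """Try `pattern` only at line starts in `data`; return the first match.

    Hypothetical helper mimicking the patch's strategy. It is intended for
    patterns anchored with "^" and compiled with re.MULTILINE, where it
    should agree with pattern.search().
    """
    # Offset 0 is a line start; any other offset must follow b"\n".
    if pos > 0 and data[pos - 1:pos] != b"\n":
        newline = data.find(b"\n", pos)
        if newline == -1:
            return None
        pos = newline + 1
    while pos <= len(data):
        match = pattern.match(data, pos)
        if match is not None:
            return match
        # Jump straight past the next newline instead of stepping one
        # character at a time.
        newline = data.find(b"\n", pos)
        if newline == -1:
            return None
        pos = newline + 1
    return None


PATTERN = re.compile(rb"^foo", re.MULTILINE)
DATA = b"bar foo baz\n" * 1000 + b"foo tail\n"

fast = search_at_line_starts(PATTERN, DATA)
slow = PATTERN.search(DATA)
assert fast is not None and slow is not None
assert fast.span() == slow.span()
```

The helper attempts a match once per line rather than once per character, which is the same reduction in SRE(match) calls the patch aims for in the single-byte case.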
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response