mirror of https://github.com/php/php-src.git synced 2026-03-24 00:02:20 +01:00

Optimize levenshtein a bit for memory usage (#13830)

When all costs are equal, levenshtein fulfills the requirements
of being a metric. A metric is symmetric, so we can swap the strings in
that case. Since we use rows of a partial matrix of length |string2| we
can make the choice of using string1 instead if |string1| < |string2|,
which will optimize memory usage and CPU time.
Niels Dossche
2024-04-02 23:00:10 +02:00
committed by GitHub
parent 33a523f64e
commit 04e0d80554

@@ -32,6 +32,15 @@ static zend_long reference_levdist(const zend_string *string1, const zend_string
		return ZSTR_LEN(string1) * cost_del;
	}
	/* When all costs are equal, levenshtein fulfills the requirements of a metric, which means
	 * that the distance is symmetric. If string1 is shorter than string 2 we can save memory (and CPU time)
	 * by having shorter rows (p1 & p2). */
	if (ZSTR_LEN(string1) < ZSTR_LEN(string2) && cost_ins == cost_rep && cost_rep == cost_del) {
		const zend_string *tmp = string1;
		string1 = string2;
		string2 = tmp;
	}
	p1 = safe_emalloc((ZSTR_LEN(string2) + 1), sizeof(zend_long), 0);
	p2 = safe_emalloc((ZSTR_LEN(string2) + 1), sizeof(zend_long), 0);
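
For readers who want to try the idea outside of php-src, the following is a minimal standalone sketch of a two-row Levenshtein routine with the same swap. It is not the php-src code: `lev_two_rows` is a hypothetical name, it works on plain NUL-terminated C strings with `malloc` instead of `zend_string` and `safe_emalloc`, and its early returns for empty input and its `-1` allocation-failure result are assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of php-src: two-row Levenshtein distance
 * with separate insert/replace/delete costs. Returns -1 if allocation fails. */
static long lev_two_rows(const char *s1, const char *s2,
                         long cost_ins, long cost_rep, long cost_del)
{
    size_t len1 = strlen(s1), len2 = strlen(s2);

    if (len1 == 0) return (long)len2 * cost_ins;
    if (len2 == 0) return (long)len1 * cost_del;

    /* Rows hold len2 + 1 entries. With equal costs the distance is
     * symmetric, so put the shorter string in s2 to shorten the rows. */
    if (len1 < len2 && cost_ins == cost_rep && cost_rep == cost_del) {
        const char *tmp_s = s1; s1 = s2; s2 = tmp_s;
        size_t tmp_len = len1; len1 = len2; len2 = tmp_len;
    }

    long *p1 = malloc((len2 + 1) * sizeof *p1); /* previous row */
    long *p2 = malloc((len2 + 1) * sizeof *p2); /* current row */
    if (p1 == NULL || p2 == NULL) {
        free(p1);
        free(p2);
        return -1;
    }

    /* Row 0: build each prefix of s2 from the empty string by insertions. */
    for (size_t j = 0; j <= len2; j++) {
        p1[j] = (long)j * cost_ins;
    }

    for (size_t i = 0; i < len1; i++) {
        p2[0] = p1[0] + cost_del;
        for (size_t j = 0; j < len2; j++) {
            long best = p1[j] + (s1[i] == s2[j] ? 0 : cost_rep); /* match or replace */
            long del = p1[j + 1] + cost_del;                      /* delete s1[i] */
            long ins = p2[j] + cost_ins;                          /* insert s2[j] */
            if (del < best) best = del;
            if (ins < best) best = ins;
            p2[j + 1] = best;
        }
        /* The current row becomes the previous row for the next iteration. */
        long *tmp_row = p1; p1 = p2; p2 = tmp_row;
    }

    long dist = p1[len2];
    free(p1);
    free(p2);
    return dist;
}
```

With unit costs, `lev_two_rows("kitten", "sitting", 1, 1, 1)` and the call with the arguments reversed both give 3, and each allocates rows of 7 entries rather than 8. With `cost_del` raised to 5, turning `"a"` into `"ab"` costs 1 while turning `"ab"` into `"a"` costs 5, which is why the swap is guarded by the equal-cost check.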