regex non-greedy matching — .*? vs .*
Quick Answer
# Greedy — matches as much as possible
/<.+>/ on "<b>bold</b>" matches "<b>bold</b>" (the whole thing)
# Non-greedy (lazy) — matches as little as possible
/<.+?>/g on "<b>bold</b>" matches "<b>" and then "</b>" as two separate matches
Usage
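Quantifiers (*, +, ?, {n,m}) are greedy by default: they match as much as possible while still letting the rest of the pattern succeed. Append ? to a quantifier to make it lazy, so it matches as little as possible.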
Other causes & fixes
Lazy quantifier syntax
*? zero or more (lazy)
+? one or more (lazy)
?? zero or one (lazy)
{n,m}? n to m times (lazy)
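To see each form in the table side by side, here is a minimal Python sketch that runs the greedy and lazy versions against the same input. The input string "aaaa" and the names `patterns` and `results` are illustrative choices, not part of any API.

```python
import re

text = "aaaa"  # illustrative input

# Greedy form first, lazy form second, for each quantifier.
patterns = [r"a*", r"a*?", r"a+", r"a+?", r"a?", r"a??", r"a{2,3}", r"a{2,3}?"]

# re.match anchors at the start of the string, so each entry shows
# how much the quantifier consumes from position 0.
results = {p: re.match(p, text).group() for p in patterns}

for pattern, matched in results.items():
    print(f"{pattern!r:12} -> {matched!r}")
```

The greedy forms take the maximum allowed ('aaaa', 'aaaa', 'a', 'aaa'), while the lazy forms take the minimum allowed ('', 'a', '', 'aa').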
JavaScript example
const html = '<b>bold</b> and <i>italic</i>';
// Greedy — matches from first < to last >
html.match(/<.+>/)[0]
// '<b>bold</b> and <i>italic</i>'
// Non-greedy — matches each tag separately
html.match(/<.+?>/g)
// ['<b>', '</b>', '<i>', '</i>']
Python example
import re
text = '"first" and "second"'
# Greedy — one big match from first " to last "
re.findall(r'".*"', text)
# ['"first" and "second"']
# Non-greedy — each quoted string separately
re.findall(r'".*?"', text)
# ['"first"', '"second"']
Prefer [^...] over .*? when possible
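A negated character class states the intent explicitly and is usually faster: the engine consumes every allowed character in one pass instead of retrying the rest of the pattern after each character. Unlike ., a negated class such as [^>] also matches newlines, so the two patterns are not always interchangeable.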
# Instead of: "<.*?>"
# Use: "<[^>]*>" — everything that is not a >
/<[^>]*>/g
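The two patterns can give different results when the input contains newlines. The Python sketch below assumes a hypothetical tag that is split across two lines; the variable names are illustrative.

```python
import re

# Hypothetical input: the opening tag contains a newline.
html = '<a\nhref="x">link</a>'

# By default . does not match a newline, so the lazy pattern
# cannot get through the opening tag and only finds the closing one.
lazy = re.findall(r"<.*?>", html)

# [^>] matches any character except >, newlines included.
negated = re.findall(r"<[^>]*>", html)

# With re.DOTALL, . also matches newlines and the lazy pattern catches up.
lazy_dotall = re.findall(r"<.*?>", html, flags=re.DOTALL)

print(lazy)         # ['</a>']
print(negated)      # ['<a\nhref="x">', '</a>']
print(lazy_dotall)  # ['<a\nhref="x">', '</a>']
```

If multi-line input is possible, the negated class avoids having to remember the DOTALL flag (the s flag in JavaScript).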
Related