regex non-greedy matching — .*? vs .*

# Greedy — matches as much as possible
/<.+>/    on "<b>bold</b>" matches "<b>bold</b>" (the whole thing)

# Non-greedy (lazy) — matches as little as possible
/<.+?>/   on "<b>bold</b>" matches "<b>" then "</b>" separately

Quantifiers (*, +, ?, {n,m}) are greedy by default: they take as much as possible while still letting the overall pattern match. Append ? to a quantifier to make it lazy.

Lazy quantifier syntax

*?      zero or more (lazy)
+?      one or more (lazy)
??      zero or one (lazy)
{n,m}?  n to m times (lazy)
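
Python sketch of the table above

A minimal sketch, not a reference: the test string "aaaa" and the names text, patterns and results are made up for illustration. It runs the greedy and lazy form of each quantifier at the start of the same string so the difference in how much each one takes is visible.

```python
import re

text = "aaaa"

# Greedy and lazy forms of each quantifier (illustrative patterns only).
patterns = ["a*", "a*?", "a+", "a+?", "a?", "a??", "a{2,3}", "a{2,3}?"]

# re.match anchors at position 0, so each result shows how much the quantifier took.
results = {p: re.match(p, text).group() for p in patterns}

for pattern, matched in results.items():
    print(f"{pattern:<8} -> {matched!r}")
```

Here a*? and a?? match the empty string, a+? matches 'a', and a{2,3}? matches 'aa', because nothing later in the pattern forces them to take more. The greedy forms give 'aaaa', 'aaaa', 'a' and 'aaa'.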

JavaScript example

const html = '<b>bold</b> and <i>italic</i>';

// Greedy — matches from first < to last >
html.match(/<.+>/)[0]
// '<b>bold</b> and <i>italic</i>'

// Non-greedy — matches each tag separately
html.match(/<.+?>/g)
// ['<b>', '</b>', '<i>', '</i>']

Python example

import re

text = '"first" and "second"'

# Greedy — one big match from first " to last "
re.findall(r'".*"', text)
# ['"first" and "second"']

# Non-greedy — each quoted string separately
re.findall(r'".*?"', text)
# ['"first"', '"second"']

Prefer [^...] over .*? when possible

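A negated character class is usually clearer and often faster than a lazy quantifier. It states exactly which characters may appear before the closing delimiter, so a backtracking engine does not have to try the rest of the pattern after every character.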

# Instead of: "<.*?>"
# Use:         "<[^>]*>"  — everything that is not a >
/<[^>]*>/g
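
Python comparison

A small sketch comparing the two patterns with Python's re module. The sample strings and variable names are assumptions chosen for illustration. On single-line input both patterns find the same tags. They differ when a tag spans a line break: . does not match a newline unless re.DOTALL is set, while [^>] does.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# On a single line, the lazy and negated-class patterns agree.
lazy_tags = re.findall(r"<.*?>", html)
negated_tags = re.findall(r"<[^>]*>", html)
print(lazy_tags == negated_tags)  # True

# Hypothetical tag that wraps onto a second line.
multiline = '<a\nhref="x">link</a>'

# '.' stops at the newline, so the opening tag is missed.
lazy_multiline = re.findall(r"<.*?>", multiline)

# '[^>]' accepts the newline, so both tags are found.
negated_multiline = re.findall(r"<[^>]*>", multiline)

# With re.DOTALL the lazy pattern finds both tags as well.
dotall_multiline = re.findall(r"<.*?>", multiline, flags=re.DOTALL)

print(lazy_multiline)     # ['</a>']
print(negated_multiline)  # ['<a\nhref="x">', '</a>']
```

Which behaviour is wanted depends on the input, so check multi-line input before swapping one pattern for the other.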