Regex basics

Basics

  • . –> Any single character except a line break

To search for a metacharacter itself, escape it with a backslash.
(For example, \. matches a literal dot.)
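
The difference between an unescaped and an escaped dot can be sketched as follows. This is a minimal illustration in Python (the language is an assumption, since these notes do not name one), and the sample text and variable names are made up for the demo.

```python
import re

text = "example.com exampleXcom"

# Unescaped dot: matches any single character except a line break,
# so it also accepts the "X" in "exampleXcom".
loose = re.findall(r"example.com", text)

# Escaped dot: matches only a literal ".".
strict = re.findall(r"example\.com", text)

print(loose)   # ['example.com', 'exampleXcom']
print(strict)  # ['example.com']
```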

Character classes

  • \d –> 0-9
  • \w –> a-z, A-Z, 0-9, _
  • \s –> whitespace (space, tab, new line)

(The uppercase form of each class is its negation.)

  • \D -> Not \d
  • \W -> Not \w
  • \S -> Not \s
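
A short Python sketch of the shorthand classes and their negations. The sample string is hypothetical. One assumption worth flagging: Python's \d and \w also match non-ASCII digits and letters by default, so re.ASCII is passed here to get the ranges listed above.

```python
import re

text = "Room_42 is\tfree"

# re.ASCII restricts \d to 0-9 and \w to a-z, A-Z, 0-9, _
digits = re.findall(r"\d", text, flags=re.ASCII)      # each digit separately
words = re.findall(r"\w+", text, flags=re.ASCII)      # runs of word characters
spaces = re.findall(r"\s", text, flags=re.ASCII)      # space and tab

# Uppercase forms match everything the lowercase form does not.
non_digits = re.findall(r"\D+", text, flags=re.ASCII)
non_words = re.findall(r"\W", text, flags=re.ASCII)
non_spaces = re.findall(r"\S+", text, flags=re.ASCII)

print(digits)      # ['4', '2']
print(words)       # ['Room_42', 'is', 'free']
print(non_digits)  # ['Room_', ' is\tfree']
```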

Quantifiers

  • {n} –> exactly n times
  • {min,max} –> at least min and at most max times
  • {min,} –> at least min times
  • ? –> {0,1}
  • + –> {1,}
  • * –> {0,}

03-3008-5432
0330085432
\d{2}-?\d{4}-?\d{4}

03-3008-5432
0330085432
TEL03-3008-5432
T0330085432
\w*\d{2}-?\d{4}-?\d{4}
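
The two phone-number patterns can be checked against the sample strings like this. It is a sketch in Python. The use of fullmatch (the whole string must match) and the variable names are choices made for the demo, not part of the original notes.

```python
import re

samples = ["03-3008-5432", "0330085432", "TEL03-3008-5432", "T0330085432"]

# -? makes each hyphen optional, i.e. -{0,1}
plain = re.compile(r"\d{2}-?\d{4}-?\d{4}")

# \w* allows zero or more word characters (such as "TEL") before the number
prefixed = re.compile(r"\w*\d{2}-?\d{4}-?\d{4}")

plain_hits = [s for s in samples if plain.fullmatch(s)]
prefixed_hits = [s for s in samples if prefixed.fullmatch(s)]

print(plain_hits)     # ['03-3008-5432', '0330085432']
print(prefixed_hits)  # all four samples
```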

Anchors

  • ^ –> starts with
  • $ –> ends with
  • \b –> word boundary

1 Tom 4
2 John 9
3 Hoe 3
4 HoeFua 8

  • ^\d –> only the first digit of each line
  • \d$ –> only the last digit of each line
  • \b –> word boundary
    \bHoe\b –> only the standalone word “Hoe”, not the “Hoe” inside “HoeFua”
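
A Python sketch of the three anchors on the sample rows. It assumes the rows are lines of a single string, so re.MULTILINE is passed to make ^ and $ apply to every line rather than only to the start and end of the whole text.

```python
import re

text = "1 Tom 4\n2 John 9\n3 Hoe 3\n4 HoeFua 8"

# ^\d: a digit at the start of each line
first_digits = re.findall(r"^\d", text, flags=re.MULTILINE)

# \d$: a digit at the end of each line
last_digits = re.findall(r"\d$", text, flags=re.MULTILINE)

# \bHoe\b: "Hoe" as a whole word; "HoeFua" is skipped
whole_word = re.findall(r"\bHoe\b", text)

# Without the boundaries, the "Hoe" inside "HoeFua" matches too
anywhere = re.findall(r"Hoe", text)

print(first_digits)  # ['1', '2', '3', '4']
print(last_digits)   # ['4', '9', '3', '8']
print(whole_word)    # ['Hoe']
```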

OR operator

  • abc|123 -> “abc” or “123”

example.com
example.net

example\.com|example\.net
example\.(com|net)
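
Both spellings of the alternation accept the same host names, which a short Python check can confirm. The extra host example.org is a hypothetical negative case added for the demo.

```python
import re

hosts = ["example.com", "example.net", "example.org"]

long_form = re.compile(r"example\.com|example\.net")
short_form = re.compile(r"example\.(com|net)")  # group limits the scope of |

long_hits = [h for h in hosts if long_form.fullmatch(h)]
short_hits = [h for h in hosts if short_form.fullmatch(h)]

print(long_hits)   # ['example.com', 'example.net']
print(short_hits)  # ['example.com', 'example.net']
```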

Flags

  • /g –> Global
  • /i –> Case-insensitive
    /aBc/i would match “AbC”
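
The /g and /i suffixes are the notation used by JavaScript-style regex literals. Python has no such suffixes, so the sketch below uses rough equivalents as an assumption: re.IGNORECASE stands in for /i, and the choice between re.search (first match only) and re.findall (every match) stands in for the absence or presence of /g.

```python
import re

text = "abc ABC AbC"

# Like /aBc/i without g: case-insensitive, first match only
first_only = re.search(r"aBc", text, flags=re.IGNORECASE).group()

# Like /aBc/gi: case-insensitive, every match
every = re.findall(r"aBc", text, flags=re.IGNORECASE)

# Without the flag the pattern is case-sensitive and finds nothing here
case_sensitive = re.findall(r"aBc", text)

print(first_only)      # abc
print(every)           # ['abc', 'ABC', 'AbC']
print(case_sensitive)  # []
```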

Character Classes

  • [abc] –> “a” or “b” or “c”

bat hat cat

bat|hat|cat
(b|h|c)at
[bhc]at
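
The three patterns above are interchangeable for this input, as the Python sketch below shows. One adjustment is an assumption of the demo: re.findall returns the contents of a capturing group rather than the whole match, so the middle pattern is written with a non-capturing group (?:...) to keep the results comparable.

```python
import re

text = "bat hat cat"

patterns = [
    r"bat|hat|cat",   # full alternation
    r"(?:b|h|c)at",   # alternation on the first letter only
    r"[bhc]at",       # character set for the first letter
]

results = [re.findall(p, text) for p in patterns]

for pattern, found in zip(patterns, results):
    print(pattern, found)  # each pattern finds bat, hat and cat
```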

Metacharacters Inside Character Classes

  • [^...] -> NOT (any single character that is not listed in the brackets)

bat hat cat eat
[^c]at –> matches “bat”, “hat” and “eat”, but not “cat”
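
A quick Python check of the negated set. The second call illustrates a detail that is easy to miss: [^c] still has to consume one character, so a bare "at" with nothing in front of it does not match.

```python
import re

text = "bat hat cat eat"

not_cat = re.findall(r"[^c]at", text)

# A negated set still requires a character to be present
bare = re.findall(r"[^c]at", "at")

print(not_cat)  # ['bat', 'hat', 'eat']
print(bare)     # []
```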

Line break and Tab

  • \t –> Tab
  • \r\n –> Line break (Windows)
  • \r –> Line break (classic Mac OS, up to v9)
  • \n –> Line break (Unix, macOS v10 and later)
  • \r\n|\r|\n –> Line break (all OSes)
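
A sketch of splitting text on any of the three line-break styles with the alternation \r\n|\r|\n, written in Python with a made-up sample string. The last call shows why the alternation is written without square brackets: a bracketed set matches one character at a time, so a Windows \r\n would be counted as two breaks.

```python
import re

text = "one\r\ntwo\rthree\nfour\tfive"

# \r\n is listed first so a Windows break is consumed as one unit
lines = re.split(r"\r\n|\r|\n", text)

# \t splits the last line on the tab
cells = re.split(r"\t", lines[-1])

# Pitfall: a set such as [\r\n] treats \r and \n as separate breaks
naive = re.split(r"[\r\n]", "a\r\nb")

print(lines)  # ['one', 'two', 'three', 'four\tfive']
print(cells)  # ['four', 'five']
print(naive)  # ['a', '', 'b']
```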

Capture

  • () –> capture group
    • $1, $2, … –> the text captured by the 1st, 2nd, … group, for use in the replacement

hoehoeblog, https://hoehoetester.github.io/
google, https://google.com
example, https://example.com

(.+),\s?(.+)
<a href="$2">$1</a> –> <a href="https://hoehoetester.github.io/">hoehoeblog</a>
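
The same replacement can be sketched in Python with re.sub. Note the notation difference, which is specific to Python and not part of the notes above: Python writes the group references as \1 and \2 rather than $1 and $2. The variable names are illustrative.

```python
import re

rows = (
    "hoehoeblog, https://hoehoetester.github.io/\n"
    "google, https://google.com\n"
    "example, https://example.com"
)

# Group 1 = name before the comma, group 2 = URL after it.
# "." does not match a line break, so each row is handled separately.
links = re.sub(r"(.+),\s?(.+)", r'<a href="\2">\1</a>', rows)

print(links)
# <a href="https://hoehoetester.github.io/">hoehoeblog</a>
# <a href="https://google.com">google</a>
# <a href="https://example.com">example</a>
```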

Reference