
Appendix C: Common Regex Patterns - Java Regular Expressions: Taming the java.util.regex Engine


Appendix C: Common Regex Patterns

This appendix presents some practical regex patterns that you can use for common matching and validation tasks.

Table C-1: IP Address ^(([0-1]?\d{1,2}\.)|(2[0-4]\d\.)|(25[0-5]\.)){3}(([0-1]?\d{1,2})|(2[0-4]\d)|(25[0-5]))$

Regex

Description

^

Beginning of line

(

A group consisting of

(

A subgroup consisting of

[0-1]?

An optional 0 or 1, followed by

\d

Any digit

{1,2}

Repeated one or two times, followed by

\.

A period

)

Close subgroup

|

Or

(

A subgroup consisting of

2

The digit 2, followed by

[0-4]

Any digit from 0 to 4, followed by

\d

Any digit, followed by

\.

A period

)

Close subgroup

|

Or

(

A subgroup consisting of

2

The digit 2, followed by

5

The digit 5, followed by

[0-5]

Any digit from 0 to 5, followed by

\.

A period

)

Close subgroup

)

Close group

{3}

Repeated three times, followed by

(

A group consisting of

(

A subgroup consisting of

[0-1]?

An optional 0 or 1, followed by

\d{1,2}

One or two digits

)

Close subgroup

|

Or

(

A subgroup consisting of

2

The digit 2, followed by

[0-4]

Any digit from 0 to 4, followed by

\d

Any digit

)

Close subgroup

|

Or

(

A subgroup consisting of

2

The digit 2, followed by

5

The digit 5, followed by

[0-5]

Any digit from 0 to 5, followed by

)

Close subgroup

)

Close group

$

End of line

* In English: Three sets of one to three digits, each ranging from 0 to 255 and each followed by a period, followed by a final set of one to three digits ranging from 0 to 255.

Note

The following pattern also matches the IP address, but all the groups have been marked as noncapturing:

(?:(?:[0-1]?\d{1,2}\.)|(?:2[0-4]\d\.)|(?:25[0-5]\.)){3}(?:(?:[0-1]?\d{1,2})|(?:2[0-4]\d)|(?:25[0-5]))

This is slightly more efficient than the previous pattern, but it's less legible.
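As a rough illustration of how the two IP address patterns might be used from Java, the sketch below compiles both and checks a candidate string. The class and method names (`IpAddressDemo`, `isIpAddress`) are hypothetical and exist only for this example. Backslashes are doubled because the patterns appear inside Java string literals.

```java
import java.util.regex.Pattern;

class IpAddressDemo {
    // Pattern from Table C-1, with backslashes doubled for a Java string literal.
    static final Pattern CAPTURING = Pattern.compile(
        "^(([0-1]?\\d{1,2}\\.)|(2[0-4]\\d\\.)|(25[0-5]\\.)){3}"
        + "(([0-1]?\\d{1,2})|(2[0-4]\\d)|(25[0-5]))$");

    // Noncapturing variant from the note above (it carries no ^ or $ anchors).
    static final Pattern NONCAPTURING = Pattern.compile(
        "(?:(?:[0-1]?\\d{1,2}\\.)|(?:2[0-4]\\d\\.)|(?:25[0-5]\\.)){3}"
        + "(?:(?:[0-1]?\\d{1,2})|(?:2[0-4]\\d)|(?:25[0-5]))");

    // Hypothetical helper: matches() requires the whole input to match,
    // so the missing anchors in the noncapturing variant do not matter here.
    static boolean isIpAddress(String candidate) {
        return NONCAPTURING.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIpAddress("192.168.0.1"));
        System.out.println(isIpAddress("256.1.1.1"));
        // Number of capturing groups each variant creates.
        System.out.println(CAPTURING.matcher("").groupCount());
        System.out.println(NONCAPTURING.matcher("").groupCount());
    }
}
```

Because the noncapturing variant is unanchored, calling `find()` instead of `matches()` would also locate an address embedded in a longer string.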

Table C-2: Simple E-mail ^(\p{Alnum}+(\.|\_|\-)?)*\p{Alnum}@(\p{Alnum}+(\.|\_|\-)?)*\p{Alpha}$

Regex

Description

^

Beginning of line

(

A group consisting of

\p{Alnum}

A letter or a digit

+

Repeated one or more times, followed by

(

A subgroup consisting of

\.

A period

|

Or

\_

An underscore

|

Or

\-

A hyphen

)

Close subgroup

?

The preceding punctuation is optional

)

Close group

*

The previous group can be repeated zero or more times, followed by

\p{Alnum}

A letter or a digit, followed by

@

An @ symbol, followed by

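(

A group consisting of

\p{Alnum}

A letter or a digit

+

Repeated one or more times, followed by

(

A subgroup consisting of

\.

A period

|

Or

\_

An underscore

|

Or

\-

A hyphen

)

Close subgroup

?

The preceding punctuation is optional

)

Close group

*

The previous group can be repeated zero or more times, followed by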

\p{Alpha}

An upper- or lowercase letter

$

End of line

* In English: Any number of alphanumeric characters, optionally separated by single periods, underscores, or hyphens, but ending in an alphanumeric character; followed by an @ symbol; followed by any number of alphanumeric characters, optionally separated by single periods, underscores, or hyphens, but ending in an upper- or lowercase letter.

Note

The following pattern is identical to the previous one, except that it also accepts an IP address after the @ symbol:

^(\p{Alnum}+(\.|\_|\-)?)*\p{Alnum}@(((\p{Alnum}+(\.|\_|\-)?)*\p{Alpha})|((([0-1]?\d{1,2}\.)|(2[0-4]\d\.)|(25[0-5]\.)){3}(([0-1]?\d{1,2})|(2[0-4]\d)|(25[0-5]))))$

For a breakdown of the IP address pattern, please see Table C-1.
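The following sketch shows one way the two e-mail patterns could be exercised in Java. The class name `EmailDemo` and its helper methods are assumptions made for illustration, and the sample addresses are invented. The patterns themselves are copied from Table C-2 and the note above, with backslashes doubled for Java string literals.

```java
import java.util.regex.Pattern;

class EmailDemo {
    // Simple e-mail pattern from Table C-2.
    static final Pattern SIMPLE_EMAIL = Pattern.compile(
        "^(\\p{Alnum}+(\\.|\\_|\\-)?)*\\p{Alnum}@"
        + "(\\p{Alnum}+(\\.|\\_|\\-)?)*\\p{Alpha}$");

    // Variant from the note: the part after @ may be a host name or an IP address.
    static final Pattern EMAIL_OR_IP = Pattern.compile(
        "^(\\p{Alnum}+(\\.|\\_|\\-)?)*\\p{Alnum}@"
        + "(((\\p{Alnum}+(\\.|\\_|\\-)?)*\\p{Alpha})"
        + "|((([0-1]?\\d{1,2}\\.)|(2[0-4]\\d\\.)|(25[0-5]\\.)){3}"
        + "(([0-1]?\\d{1,2})|(2[0-4]\\d)|(25[0-5]))))$");

    // Hypothetical helpers wrapping Matcher.matches().
    static boolean isSimpleEmail(String s) {
        return SIMPLE_EMAIL.matcher(s).matches();
    }

    static boolean isEmailOrIp(String s) {
        return EMAIL_OR_IP.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSimpleEmail("john.smith@example.com"));
        System.out.println(isSimpleEmail("john..smith@example.com")); // doubled period
        System.out.println(isSimpleEmail("admin@192.168.0.1"));       // ends in a digit
        System.out.println(isEmailOrIp("admin@192.168.0.1"));
    }
}
```

Both patterns nest a `+` inside a `*` group, so it is probably wise to limit the input length before matching untrusted text.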

Table C-3: Digit Repeated Exactly n Times \d{n}, Where n Is the Number of Digits Needed

Regex

Description

\d

Any digit

{n}

Repeated n times

* In English: n digits. Thus, if n was equal to 4, any four digits.
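A minimal sketch of building `\d{n}` at run time for a given n is shown below. The class `DigitCountDemo` and the helper `exactlyNDigits` are hypothetical names introduced for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitCountDemo {
    // Hypothetical helper: builds the pattern \d{n} for a caller-supplied n.
    static Pattern exactlyNDigits(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return Pattern.compile("\\d{" + n + "}");
    }

    public static void main(String[] args) {
        Pattern fourDigits = exactlyNDigits(4);

        // matches() succeeds only if the entire input is exactly four digits.
        System.out.println(fourDigits.matcher("2024").matches());
        System.out.println(fourDigits.matcher("20245").matches());

        // find() looks for four digits anywhere in the input.
        Matcher m = fourDigits.matcher("20245");
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}
```

The distinction between `matches()` and `find()` matters here because the pattern has no `^` or `$` anchors of its own.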

Table C-4: Characters Repeated Exactly n Times \w{n}, Where n Is the Number of Characters Needed

Regex

Description

\w

Any letter, any digit, or an underscore symbol

{n}

Repeated n times

* In English: n characters. Thus, if n was equal to 4, any four characters.

Table C-5: Characters Repeated n to m Times \w{n,m}, Where n Is the Minimum and m Is the Maximum Number of Characters Needed

Regex

Description

\w

Any letter, any digit, or an underscore symbol

{n,

Repeated at least n times

m}

But not more than m times

* In English: n to m characters. Thus, if n was equal to 4 and m was equal to 9, any four, five, six, seven, eight, or nine characters.
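The patterns in Tables C-4 and C-5 can be tried out with a sketch like the following. The class `WordCharDemo` and its two helpers are assumed names, not part of any library. The sample inputs are invented.

```java
import java.util.regex.Pattern;

class WordCharDemo {
    // Hypothetical helper: builds \w{n}, exactly n word characters.
    static Pattern exactly(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return Pattern.compile("\\w{" + n + "}");
    }

    // Hypothetical helper: builds \w{n,m}, between n and m word characters.
    static Pattern between(int n, int m) {
        if (n < 0 || m < n) {
            throw new IllegalArgumentException("need 0 <= n <= m");
        }
        return Pattern.compile("\\w{" + n + "," + m + "}");
    }

    public static void main(String[] args) {
        Pattern fourToNine = between(4, 9);
        System.out.println(fourToNine.matcher("abc").matches());        // too short
        System.out.println(fourToNine.matcher("abc_123").matches());    // seven word characters
        System.out.println(fourToNine.matcher("abcdefghij").matches()); // too long
        System.out.println(fourToNine.matcher("ab-cd").matches());      // hyphen is not a word character
        System.out.println(exactly(4).matcher("a_b1").matches());
    }
}
```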

Table C-6: Credit Cards: Visa, MasterCard, American Express, and Discover ^((4\d{3})|(5[1-5]\d{2})|(6011))-?\d{4}-?\d{4}-?\d{4}|3[4,7]\d{13}$

Regex

Description

^

Beginning of line

(

A group consisting of

(

A subgroup consisting of

4

The digit 4, followed by

\d{3}

Any three digits

)

Close subgroup

|

Or

(

A subgroup consisting of

5

The digit 5, followed by

[1-5]

Any digit ranging from 1 to 5, followed by

\d{2}

Any two digits

)

Close subgroup

|

Or

(

A subgroup consisting of

6011

The digits 6, 0, 1, 1

)

Close subgroup

)

Close group, followed by

-?

An optional hyphen, followed by

\d{4}

Any four digits, followed by

-?

An optional hyphen, followed by

\d{4}

Any four digits, followed by

-?

An optional hyphen, followed by

\d{4}

Any four digits

|

Or

3

The digit 3, followed by

[4,7]

A 4 or a 7 (as written, the comma inside the brackets is also matched literally; [47] would exclude it), followed by

\d{13}

Any thirteen digits

$

End of line

* In English: A number starting with 4 and three digits, or 51 through 55 and two digits, or 6011, followed by three sets of four digits, each set preceded by an optional hyphen; or 34 or 37 followed by thirteen digits.

Note

This regex does not, and cannot, perform mod 10 verification. To find a Java program that does, please visit http://www.influxs.com.
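For readers who want to see how a format check and a mod 10 check might be combined, here is a minimal sketch. It is not the program referred to in the note. The class `CardNumberDemo` and its methods are hypothetical, the mod 10 routine is the standard Luhn algorithm, and the sample numbers are widely published test numbers rather than real cards. The pattern is copied from Table C-6 as printed, including the comma inside `[4,7]`.

```java
import java.util.regex.Pattern;

class CardNumberDemo {
    // Pattern from Table C-6, copied as printed (the comma in [4,7] is literal).
    static final Pattern CARD = Pattern.compile(
        "^((4\\d{3})|(5[1-5]\\d{2})|(6011))-?\\d{4}-?\\d{4}-?\\d{4}|3[4,7]\\d{13}$");

    // Luhn (mod 10) check: double every second digit from the right,
    // subtract 9 from any result above 9, and require the sum to end in 0.
    static boolean passesMod10(String digits) {
        int sum = 0;
        boolean doubleIt = false;
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (d < 0 || d > 9) {
                return false; // reject anything that is not a digit
            }
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    // Hypothetical helper: format check first, then mod 10 on the bare digits.
    static boolean isPlausibleCard(String input) {
        if (!CARD.matcher(input).matches()) {
            return false;
        }
        return passesMod10(input.replace("-", ""));
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleCard("4111-1111-1111-1111"));
        System.out.println(isPlausibleCard("4111111111111112"));
        System.out.println(isPlausibleCard("378282246310005"));
    }
}
```

The sketch uses `matches()`, which requires the whole input to match. With `find()`, the `^` would apply only to the first alternative and the `$` only to the second, so wrapping both alternatives in one group would be the safer choice there.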

Table C-7: Real Number ^[+-]?\d+(\.\d+)?$

Regex

Description

^

Beginning of line, followed by

[+-]?

An optional plus or minus sign, followed by

\d+

One or more digits, followed by

(

A group consisting of

\.

A period, followed by

\d+

One or more digits

)?

Close group, and make it optional

$

End of line

* In English: An optional sign, followed by one or more digits, followed by an optional decimal component.
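The real number pattern might be used to validate input before parsing, as in the sketch below. The names `RealNumberDemo`, `isReal`, and `parseOrNull` are assumptions introduced for this example.

```java
import java.util.regex.Pattern;

class RealNumberDemo {
    // Pattern from Table C-7.
    static final Pattern REAL = Pattern.compile("^[+-]?\\d+(\\.\\d+)?$");

    static boolean isReal(String s) {
        return REAL.matcher(s).matches();
    }

    // Hypothetical helper: parses only input the pattern accepts, else returns null.
    static Double parseOrNull(String s) {
        return isReal(s) ? Double.valueOf(s) : null;
    }

    public static void main(String[] args) {
        System.out.println(isReal("3.14"));
        System.out.println(isReal("-42"));
        System.out.println(isReal(".5"));   // no digits before the period
        System.out.println(isReal("5."));   // no digits after the period
        System.out.println(isReal("1e10")); // exponent notation is not covered
        System.out.println(parseOrNull("+0.5"));
    }
}
```

The pattern is stricter than `Double.valueOf`, which on its own also accepts forms such as `.5` and `1e10`.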

