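This article looks at regular expressions for validating email addresses, from a short practical pattern to the far stricter versions used in various languages and tools.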
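The specification for email addresses comes from RFC 5322. There is a website, emailregex.com, dedicated to listing email-validation regular expressions for various programming languages. Many of them reach a level of complexity I had heard about but never actually seen. All I can say is: how badly must email validation have hurt the programmer who built that site!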
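In production, though, you generally do not need a regular expression that complex to be right 99.99% of the time. For the sake of execution speed and test coverage, a simple version is usually enough: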
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
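As a rough illustration of how the simple pattern behaves, here is a minimal Python sketch. The function name `looks_like_email` and the sample addresses are made up for this example; `re.IGNORECASE` stands in for the `/i` flag.

```python
import re

# The simple pattern from the text; re.IGNORECASE plays the role of the /i flag.
SIMPLE_EMAIL = re.compile(r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", re.IGNORECASE)


def looks_like_email(address: str) -> bool:
    """Return True if the address matches the simple pattern (hypothetical helper)."""
    return SIMPLE_EMAIL.match(address) is not None


if __name__ == "__main__":
    for sample in (
        "alice@example.com",
        "bob.smith+tag@mail.example.org",
        "no-at-sign.example.com",
        "carol@example.museum",  # top-level domain longer than 4 letters
    ):
        print(sample, looks_like_email(sample))
```

Note the trade-off baked into `{2,4}`: an address whose top-level domain has more than four letters, such as the `.museum` sample, is rejected. That is part of what "simple" costs here.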
Now let's look at the stricter, more complex regular expressions:
General-purpose regular expression for validating email addresses (RFC 5322 compliant)
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
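Because languages differ in their regular expression support and syntax, and the patterns differ in how much of the specification they cover, the regular expression looks different from language to language: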
Python
This one is a simple version:
r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)"
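One detail worth seeing in practice: in Python, `$` also matches just before a trailing newline, so how you call the pattern matters. The sketch below assumes the pattern is stored in a variable named `EMAIL_RE` and wraps it in a hypothetical `is_valid` helper that uses `fullmatch` instead of `match`.

```python
import re

# The Python pattern from the text, compiled as-is.
EMAIL_RE = re.compile(r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)")


def is_valid(address: str) -> bool:
    """Require the whole string to match, so a trailing newline is rejected."""
    return EMAIL_RE.fullmatch(address) is not None


if __name__ == "__main__":
    print(is_valid("user@sub.example.com"))
    # match() accepts this because "$" can match before a final "\n" ...
    print(EMAIL_RE.match("user@example.com\n") is not None)
    # ... while fullmatch() does not.
    print(is_valid("user@example.com\n"))
```

Whether a trailing newline matters depends on where the input comes from; stripping whitespace before validating is another reasonable choice.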
Javascript
This one is a bit more complicated:
/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,5})?$/i
Swift
[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,6}
PHP
PHP's version is more complex still, and covers more cases:
/^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$/iD
Perl / Ruby
Perl and Ruby were not about to be outdone by the PHP version; they can be even stricter:
(?:(?:\r\n)?[ \t])*(?:(?:(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[\t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[\t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[\t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ 
.\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*)|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^() @,;:\\ .\[\]\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\]\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)*\ (?:(?:\r\n)?[ \t])*(?:@(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[\t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))| (?:[^\ \r\\]|\\.|(?:(?:\r\n)?[ \t]))* (?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ 
\t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^() @,;:\\ .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[ () @,;:\\ .\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\ (?:(?:\r\n)?[ \t])*))*)?;\s
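Perl 5.10 and later
The version above, well, can I call it hieroglyphics? In any case, I have no intention of deciphering it. Newer versions of Perl do, of course, have a more readable version (are you serious?):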
/(?(DEFINE)
  (?<address>         (?&mailbox) | (?&group))
  (?<mailbox>         (?&name_addr) | (?&addr_spec))
  (?<name_addr>       (?&display_name)? (?&angle_addr))
  (?<angle_addr>      (?&CFWS)? < (?&addr_spec) > (?&CFWS)?)
  (?<group>           (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ; (?&CFWS)?)
  (?<display_name>    (?&phrase))
  (?<mailbox_list>    (?&mailbox) (?: , (?&mailbox))*)

  (?<addr_spec>       (?&local_part) \@ (?&domain))
  (?<local_part>      (?&dot_atom) | (?&quoted_string))
  (?<domain>          (?&dot_atom) | (?&domain_literal))
  (?<domain_literal>  (?&CFWS)? \[ (?: (?&FWS)? (?&dcontent))* (?&FWS)? \] (?&CFWS)?)
  (?<dcontent>        (?&dtext) | (?&quoted_pair))
  (?<dtext>           (?&NO_WS_CTL) | [\x21-\x5a\x5e-\x7e])

  (?<atext>           (?&ALPHA) | (?&DIGIT) | [!#\$%&'*+-/=?^_`{|}~])
  (?<atom>            (?&CFWS)? (?&atext)+ (?&CFWS)?)
  (?<dot_atom>        (?&CFWS)? (?&dot_atom_text) (?&CFWS)?)
  (?<dot_atom_text>   (?&atext)+ (?: \. (?&atext)+)*)

  (?<text>            [\x01-\x09\x0b\x0c\x0e-\x7f])
  (?<quoted_pair>     \\ (?&text))

  (?<qtext>           (?&NO_WS_CTL) | [\x21\x23-\x5b\x5d-\x7e])
  (?<qcontent>        (?&qtext) | (?&quoted_pair))
  (?<quoted_string>   (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))* (?&FWS)? (?&DQUOTE) (?&CFWS)?)

  (?<word>            (?&atom) | (?&quoted_string))
  (?<phrase>          (?&word)+)

  # Folding white space
  (?<FWS>             (?: (?&WSP)* (?&CRLF))? (?&WSP)+)
  (?<ctext>           (?&NO_WS_CTL) | [\x21-\x27\x2a-\x5b\x5d-\x7e])
  (?<ccontent>        (?&ctext) | (?&quoted_pair) | (?&comment))
  (?<comment>         \( (?: (?&FWS)? (?&ccontent))* (?&FWS)? \) )
  (?<CFWS>            (?: (?&FWS)? (?&comment))* (?: (?:(?&FWS)? (?&comment)) | (?&FWS)))

  # No whitespace control
  (?<NO_WS_CTL>       [\x01-\x08\x0b\x0c\x0e-\x1f\x7f])

  (?<ALPHA>           [A-Za-z])
  (?<DIGIT>           [0-9])
  (?<CRLF>            \x0d \x0a)
  (?<DQUOTE>          ")
  (?<WSP>             [\x20\x09])
)

(?&address)/x
Ruby (simple version)
Ruby points out that it has a simple version too:
/\A([\w+\-].?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i
.NET
"Who doesn't have a version like that?" says .NET:
^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
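It is easy to overlook how permissive `\w` is in a pattern like this. The Python sketch below reuses the same pattern text to show the effect; Python's `\w` is also Unicode-aware for `str` patterns unless `re.ASCII` is passed, which I am assuming is close enough to .NET's default behaviour for illustration. The helper name `matches` and its `ascii_only` parameter are invented for this example.

```python
import re

# Same pattern text as the .NET version above.
DOTNET_STYLE = r"^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$"


def matches(address: str, ascii_only: bool = False) -> bool:
    """Test the pattern, optionally restricting \\w to ASCII (hypothetical helper)."""
    flags = re.ASCII if ascii_only else 0
    return re.match(DOTNET_STYLE, address, flags) is not None


if __name__ == "__main__":
    print(matches("first.last@example.co.uk"))
    # \w includes the underscore, so underscores are accepted in the domain too.
    print(matches("a_b@c_d.e_f"))
    # Unicode letters count as \w by default, but not with re.ASCII.
    print(matches("josé@example.com"))
    print(matches("josé@example.com", ascii_only=True))
```

So the short pattern is lenient in ways the longer ones are not: it accepts underscores in the domain and non-ASCII letters anywhere `\w` appears.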
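The grep command
When you use grep to find email addresses in a file, I doubt you want to write a regular expression that runs to several lines. A rough one will do: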
$ grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" filename.txt
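If grep is not at hand, the same extraction can be sketched in Python. The function name `extract_emails` and the sample text are invented for this example; `findall` returns only the matched substrings, roughly what `-o` does for grep.

```python
import re
from typing import List

# Same pattern as the grep command; \b is a word boundary in Python's re as well.
EMAIL_IN_TEXT = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b")


def extract_emails(text: str) -> List[str]:
    """Return every substring of the text that looks like an email address."""
    return EMAIL_IN_TEXT.findall(text)


if __name__ == "__main__":
    sample = "Contact alice@example.com or bob@mail.example.org; ignore not-an-email@ and @nothing."
    print(extract_emails(sample))
    # prints ['alice@example.com', 'bob@mail.example.org']
```

To scan a real file, read its contents (for example with `pathlib.Path(...).read_text()`) and pass the text to the function.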
SQL Server
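SQL Server can do pattern matching as well, although PATINDEX and LIKE use wildcard patterns rather than full regular expressions. This snippet looks like it came from a production system, because it also thoughtfully catches people who mistype their email address: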
select email
from table_name
where patindex('%[ , !+=\/() ]%', email) > 0 -- Invalid characters
   or patindex('[@.-_]%', email) > 0 -- Valid but cannot be starting character
   or patindex('%[@.-_]', email) > 0 -- Valid but cannot be ending character
   or email not like '%@%.%' -- Must contain at least one @ and one .
   or email like '%..%' -- Cannot have two periods in a row
   or email like '%@%@%' -- Cannot have two @ anywhere
   or email like '%.@%' or email like '%@.%' -- Cannot have @ and . next to each other
   or email like '%.cm' or email like '%.co' -- Cameroon or Colombia? Typos.
   or email like '%.or' or email like '%.ne' -- Missing last letter
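The rule-by-rule approach of that query carries over to application code. Below is a minimal Python sketch of the same checks. The function name `email_problems` and the message strings are made up. The invalid-character set contains only the characters still legible in the listing, and I am assuming the `[@.-_]` class is meant as the four characters @ . - _.

```python
from typing import List

INVALID_CHARS = set(" ,!+=\\/()")        # characters legible in the first PATINDEX check
EDGE_CHARS = ("@", ".", "-", "_")        # assumed intent of the [@.-_] class
TYPO_SUFFIXES = (".cm", ".co", ".or", ".ne")


def email_problems(email: str) -> List[str]:
    """Return the reasons an address would be flagged; an empty list means it passes."""
    problems = []
    if any(ch in INVALID_CHARS for ch in email):
        problems.append("invalid character")
    if email.startswith(EDGE_CHARS):
        problems.append("bad starting character")
    if email.endswith(EDGE_CHARS):
        problems.append("bad ending character")
    at = email.find("@")
    if at == -1 or "." not in email[at + 1:]:
        problems.append("needs an @ followed by a .")
    if ".." in email:
        problems.append("two periods in a row")
    if email.count("@") > 1:
        problems.append("more than one @")
    if ".@" in email or "@." in email:
        problems.append("@ next to a period")
    if email.lower().endswith(TYPO_SUFFIXES):
        problems.append("suspicious ending (possible typo)")
    return problems


if __name__ == "__main__":
    for sample in ("alice@example.com", "bob@example.co", "a..b@example.com"):
        print(sample, email_problems(sample))
```

Returning the list of reasons, rather than a single yes/no, makes it easier to tell users what to fix. Note that the typo rule flags legitimate `.co` addresses too, exactly as the SQL does.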
Oracle PL/SQL
Isn't this one a bit lazy, especially after all those "complex" regular expressions?
SELECT email FROM table_name WHERE REGEXP_LIKE (email, '[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,4}');
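Two things about this pattern are easy to miss: it has no `^` or `$` anchors, and its character classes list only uppercase letters. REGEXP_LIKE looks for a match anywhere in the value, so the Python sketch below uses `re.search` to mimic that, assuming a case-sensitive comparison (in Oracle the optional `'i'` match parameter would relax this). The helper names are invented for illustration.

```python
import re

# Same pattern as in the REGEXP_LIKE call: uppercase-only classes, no anchors.
PATTERN = r"[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,4}"


def contains_match(value: str) -> bool:
    """Unanchored, case-sensitive search: true if the pattern matches anywhere."""
    return re.search(PATTERN, value) is not None


def is_whole_match(value: str) -> bool:
    """Stricter variant: the whole value must match, ignoring case."""
    return re.fullmatch(PATTERN, value, re.IGNORECASE) is not None


if __name__ == "__main__":
    print(contains_match("see BOB@EXAMPLE.COM!!"))   # junk around the address is tolerated
    print(contains_match("bob@example.com"))         # lowercase fails the uppercase classes
    print(is_whole_match("bob@example.com"))
    print(is_whole_match("see BOB@EXAMPLE.COM!!"))
```

If the column should contain nothing but an address, anchoring the pattern and making it case-insensitive is probably what you want.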
MySQL
Well, it looks like MySQL is just as lazy:
SELECT * FROM `users` WHERE `email` NOT REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$';