Regular expression fun with emails; top level domain not required when it should be

I'm trying to create a regular expression that will filter valid emails using PHP and have run into an issue that conflicts with what I understand of regular expressions. Here is the code that I am using.

if (!preg_match('/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+.[a-zA-Z]{2,4}$/', $string)) {
    return false;
}

Now from the materials that I've researched, this should allow content before the @ to be multiple letters, numbers, underscores and periods, then afterwards to allow multiple letters and numbers, then require a period, then two to four letters for the top level domain.

However, right now it ignores the requirement for having the top level domain section. For example a@b.c obviously is valid (and should be), but a@b is also returning as valid, and I want it to be flagged as invalid.

I'm sure I'm missing something, but after browsing Google for an hour I'm at a loss as to what it could be. Anyone have an answer for this conundrum?

EDIT: The speed at which answers arrive here makes this site superior to its competitors. Well done!

13.10.2009 18:49:03
Your regular expression does not match a@b.c.d.
Greg Hewgill 13.10.2009 18:54:50
Is it supposed to match any email address, meaning just check if it's a valid one? Check out PHP's own filter_var function with the FILTER_VALIDATE_EMAIL constant. It might do the trick just fine.
Jörg 13.10.2009 18:58:27
Ya I think I might just use it. This isn't behaving as I've been told through multiple sources.
canadiancreed 13.10.2009 19:10:36

You should escape . when it's not inside a character class: '/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/' Otherwise it matches any character:

  • . - any character (except the newline \n unless the s modifier is used)
  • \. - a literal dot
  • [.] - a literal dot (inside a character class)
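The three forms listed above can be compared side by side. This is a minimal sketch: the variable names and sample addresses are made up for illustration, and only the first pattern is the one from the question.

```php
<?php
// Pattern from the question: the dot before the TLD is unescaped.
$unescaped = '/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+.[a-zA-Z]{2,4}$/';
// Same pattern with the dot escaped.
$escaped = '/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/';
// Same pattern with the dot placed in its own character class.
$bracketed = '/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+[.][a-zA-Z]{2,4}$/';

// Hypothetical sample inputs: one without a dot in the domain, one with.
foreach (['a@basd', 'a@example.com'] as $address) {
    printf(
        "%-14s unescaped=%d escaped=%d bracketed=%d\n",
        $address,
        preg_match($unescaped, $address),
        preg_match($escaped, $address),
        preg_match($bracketed, $address)
    );
}
```

The address a@basd is accepted only by the unescaped pattern, because the bare . is free to consume one of the letters after the @. The escaped and bracketed forms behave identically to each other.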
13.10.2009 19:00:22
Instead of \., I find [.] to be more readable. It puts the . character into its own group.
Thomas Owens 13.10.2009 18:55:41
Agreed. Although it didn't make a difference. Both \. and [.] still say that the email passed is valid.
canadiancreed 13.10.2009 19:01:38
I've just executed var_dump(preg_match('/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/', 'a@basd')); and it prints int(0) which is false
Ivan Nevostruev 13.10.2009 19:15:24
Yep I found the mistake on my end. My apologies for the erroneous reply earlier.
canadiancreed 13.10.2009 19:30:18
or won't validate with that regular expression.
Mauricio 23.10.2009 22:50:02

Rather than rolling your own, perhaps you should read the article How to Find or Validate an Email Address. The article also discusses reasons why you might not want to validate an email address using a regular expression and provides 3 regular expressions that you might consider using instead of your own.

13.10.2009 18:53:29

A single dot in a regular expression means "match any character". And that's exactly what it does when a top level domain is missing (also when it's present, of course).

Thus you should change your code like this:

if (!preg_match('/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/', $string)) {
    return false;
}

And by the way: a lot more characters are allowed in the local part than what your regular expression currently allows for.
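To make that limitation concrete, here is a small sketch that runs the corrected pattern against a few addresses. The sample addresses and the variable names are hypothetical, chosen only to show a plus sign in the local part, a subdomain, and a top level domain longer than four letters.

```php
<?php
// The corrected pattern from this answer (dot escaped).
$pattern = '/^[-a-zA-Z0-9_.]+@[-a-zA-Z0-9]+\.[a-zA-Z]{2,4}$/';

// Hypothetical sample addresses, all plausible in practice.
$samples = [
    'user@example.com',        // plain address
    'user+tag@example.com',    // "+" is not in the local-part class
    'user@mail.example.com',   // the domain class contains no dot
    'curator@example.museum',  // TLD longer than four letters
];

foreach ($samples as $address) {
    $verdict = preg_match($pattern, $address) === 1 ? 'accepted' : 'rejected';
    echo $address, ': ', $verdict, "\n";
}
```

Only the first address is accepted. The others fail for the reasons noted in the comments, which is the kind of gap a hand-rolled pattern tends to have.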

13.10.2009 18:56:10
Agreed on your link. I figured though that I should get this working before I start to get more involved and get in way over my head. Also tried your code. Same result, it does not require a dot and validates without it.
canadiancreed 13.10.2009 19:08:12
13.10.2009 19:06:49
Will this work for PHP? I ask as it looks to be a Perl module?
canadiancreed 13.10.2009 19:09:11
The Perl module just gives you an easy way of running things through that regular expression.
ceejayoz 13.10.2009 20:37:35

This is the most reasonable trade off of the spec versus real life that I have seen:


Of course, you have to remove the line breaks, and you have to update it if more top-level domains become available.

13.10.2009 19:15:34

From the page Comparing E-mail Address Validating Regular Expressions: Geert De Deckere from the Kohana project has developed a near perfect one:


There is also a built-in function in PHP, filter_var($email, FILTER_VALIDATE_EMAIL), but it seems to be under development. And there is another serious solution: PEAR:Validate. I think the PEAR solution is the best one.
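For reference, the built-in route might be used roughly like this. It is only a sketch: the wrapper name isValidEmail and the sample inputs are assumptions made for illustration, while filter_var and FILTER_VALIDATE_EMAIL are the PHP built-ins mentioned above.

```php
<?php
/**
 * Hypothetical helper: returns true when PHP's built-in filter
 * accepts the address. filter_var() returns the filtered string
 * on success and false on failure, hence the strict comparison.
 */
function isValidEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Made-up sample inputs.
foreach (['user@example.com', 'not-an-email', 'a@b'] as $candidate) {
    echo $candidate, ': ', isValidEmail($candidate) ? 'valid' : 'invalid', "\n";
}
```

Because filter_var returns the address itself on success, comparing against false with !== avoids treating an unusual but valid string as a failure.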

13.10.2009 19:30:53
I've run into some limitations of the filter_var one (unlimited top domain sizes for one) so I'll give the PEAR one a shot. Thanks!
canadiancreed 13.10.2009 20:33:25
What are "unlimited top domain sizes"? It is my understanding that a TLD can be up to 6 characters (.museum) and a domain label can be up to 63 characters.
ty812 27.10.2009 00:01:55