For a few years now, I’ve owned gurganus.name. Over this year, I’ve been switching my e-mail over to the address I have through it. This can be problematic, though. Here’s what I occasionally run into.
Some sites validate e-mail addresses in various ways. Generally, validating input is a good thing, but not when it is done wrong. What’s being done wrong? I often see the validation done with a regular expression that restricts the top-level domain to two or three letters, so an address ending in .name is rejected. This is naive and wrong. If you look at the list of top-level domains managed by the Internet Assigned Numbers Authority (IANA), you’ll find many two- and three-letter names, but you will also find the “museum” top-level domain. Even if you included a check for every existing top-level domain’s length, you would end up with a system that breaks as soon as a new top-level domain is created.
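To make the failure concrete, here is a small Python sketch of the kind of check I mean. The pattern and the name naive_is_valid are my own illustration, not code taken from any particular site, but the two-or-three-letter restriction at the end is the part that matters.

```python
import re

# Hypothetical example of the naive pattern: the final label (the
# top-level domain) is restricted to two or three letters.
NAIVE_EMAIL = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,3}$")


def naive_is_valid(address: str) -> bool:
    """Return True if the address passes the naive check."""
    return NAIVE_EMAIL.match(address) is not None


if __name__ == "__main__":
    # The example addresses are made up for illustration.
    for address in ("user@example.com", "user@gurganus.name", "curator@example.museum"):
        print(address, naive_is_valid(address))
```

Under this check, the .com address passes while the .name and .museum addresses are turned away, even though all three have the same shape.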
What should you do instead?
The better approach is simply to follow the e-mail address format specified in the addr-spec production of RFC 2822. It places no length restriction on the top-level domain. If the domain part is invalid, you’ll find out when you look up the domain name in order to send a message.
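As a rough sketch of what that could look like in Python, the pattern below follows only the dot-atom form of addr-spec on both sides of the @ sign. It is a deliberate simplification: quoted strings, domain literals, comments, and the obsolete forms that RFC 2822 also allows are left out, and the name looks_like_addr_spec is just something I made up for the example.

```python
import re

# "atext" characters from RFC 2822: letters, digits, and a set of symbols.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom-text: one or more atext runs separated by single dots.
DOT_ATOM = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# Simplified addr-spec: dot-atom "@" dot-atom. This is an assumption made
# for brevity; the full grammar accepts more forms than this.
ADDR_SPEC = re.compile(rf"{DOT_ATOM}@{DOT_ATOM}")


def looks_like_addr_spec(address: str) -> bool:
    """Return True if the whole string matches the simplified addr-spec."""
    return ADDR_SPEC.fullmatch(address) is not None


if __name__ == "__main__":
    # The example addresses are made up for illustration.
    for address in ("user@gurganus.name", "curator@example.museum", "a..b@example.com"):
        print(address, looks_like_addr_spec(address))
```

Nothing here cares how long the last label of the domain is, so a new top-level domain needs no change to the check. Whether the domain actually exists is left to the lookup that happens when the message is sent.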