All posts by antic

http://adameivy.com

File Deletion & Cloud Backups + Can You Trust Google?

2009/09/10 antic

I just strolled by my RSS feeds and saw Bruce Schneier’s latest on file deletion: http://bit.ly/Mx24L. I’ve said this many times, but there’s more to be said. The core of his point is that when you delete a file from a web service, you can never be sure that it will actually be deleted. After all, you most likely agreed in the terms of use that they own your data. Sometimes, though, you can protect your data before it gets to that point.

I actually use Google for a lot of things. The convenience far outweighs the violation of privacy for me. This is because, while I take steps to protect some of my data, I don’t really care about privacy for most of my data in the cloud. I don’t care if someone reads 99% of my emails. Most of my communications are in public forums these days anyway (Twitter, Facebook status updates, etc.), which have so little privacy it’s laughable.

Gmail

I use Gmail to check and manage all of my email accounts. I have several different servers/accounts with various standards of spam filtering and accessibility (and all with different interfaces). Gmail lets me combine them all into one, adding the best email search and online interface I’ve ever used.

I used to use Thunderbird, manage meticulous backups myself, and carry my latest email repository around on a massive thumbdrive. But then I realized something: it’s a pain to manage all of my email locally, especially when I want to be able to access it from any machine (or my iPhone). And, at the same time, I wasn’t really providing myself any extra security. In fact, I was making my email less secure. Here’s why:

All email goes through several parties anyway before it gets to you. Even if you manage your email locally, it resides in someone’s outbox, their backups, your email server and its backups, and with anyone who intercepted it along the way between any of those points. But now that you are managing a local copy, you’ve added more locations where your data can be compromised: now it’s at your house, on your thumbdrive, in your personal backups, and maybe, as I once had it, checked into a Subversion repository and checked out on several different machines at home and work.

People always give me Gmail’s AdWords parsing of your email data as an example of why Gmail is creepy and untrustworthy, but if you are really concerned about it, use PGP/GPG (Tutorial + Download, or FireGPG, a Firefox plugin). Now, that’s easier said than done. The average lay user is not going to be comfortable encrypting their emails before sending them and decrypting them before reading them. But if you are really concerned about your privacy, it’s a small price to pay, and the FireGPG Firefox plugin integrates beautifully with Gmail, allowing seamless reading of encrypted email.

Once you have PGP/GPG set up, get my key and send me an encrypted message :)

Google Docs

This is a slightly different situation from Gmail because until I put some of my documents on Google Docs, the only place they existed was in my personal subversion repository and the thumbdrive that I checked them out on. Now, I’m faced with a little bit of a problem:

If I want to store information in Google Docs that I don’t want someone to read, I have to encrypt it. This works well for standard Google Documents, but it can’t be used on Google Spreadsheets. This ends up being a weak point in my security, and I’m constantly reminding myself not to make the spreadsheet transition yet. Even managing the Word-like Document system with encryption can be tricky, since you have to encrypt the data before you paste it into your Google Document. Otherwise they auto-save an unencrypted copy and you lose any opportunity to keep that data private (once it’s saved, it’s in their backups and you can’t ever be sure it’s gone).

The thing to remember about online data storage is this: if you downloaded it from a server somewhere, it’s already in the cloud–you might be lucky and be able to delete it but you can never be certain that it isn’t sitting in someone’s backup or cache of intercepted traffic. But if you have private documents, messages, etc that you want to store on the cloud or send to someone over the internet, the only way you can be sure your data is safe is if you encrypt it before it goes into any outbox, inbox, or text area of any kind in a web browser. In the end, security is only as good as the weakest link: if you are communicating/sharing data with someone else, make sure they are as concerned as you are–or they might just copy and paste your data into their unencrypted cloud backup system.
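
As a rough illustration of encrypting before anything reaches a web form, here is a minimal sketch in JavaScript for Node.js. It is not the PGP/GPG workflow described above: it swaps in passphrase-based AES-256-GCM from Node's built-in crypto module, and the function names encryptText and decryptText are made up for this example.

```javascript
const crypto = require('crypto');

// Encrypt text with a passphrase; returns a base64 string that is safe to paste.
// Layout of the decoded bytes: 16-byte salt | 12-byte IV | 16-byte auth tag | ciphertext
function encryptText(plaintext, passphrase) {
  const salt = crypto.randomBytes(16);
  const key = crypto.scryptSync(passphrase, salt, 32); // derive a 256-bit key
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([salt, iv, tag, ciphertext]).toString('base64');
}

// Reverse of encryptText; throws if the passphrase is wrong or the data was altered.
function decryptText(encoded, passphrase) {
  const raw = Buffer.from(encoded, 'base64');
  const salt = raw.subarray(0, 16);
  const iv = raw.subarray(16, 28);
  const tag = raw.subarray(28, 44);
  const ciphertext = raw.subarray(44);
  const key = crypto.scryptSync(passphrase, salt, 32);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

// Only the encrypted string ever goes into the browser text area.
const pasted = encryptText('notes I would rather keep private', 'a long passphrase');
const restored = decryptText(pasted, 'a long passphrase');
console.log(restored === 'notes I would rather keep private');
```

The point of the sketch is the order of operations: the encryption happens locally, and the document service only ever sees the base64 output.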

PHP, Regular Expressions, Tutorials

Email Validation Using Regular Expressions (the right way… really)

2009/09/04 antic

OK, I know this is the millionth blog post claiming to have the right way to validate email addresses, but hear me out:

Regular expressions are awesome, and I mean Wrath of Khaaaaan! awesome. They wield the unholy power to make or break your system, to secure or rip apart your entry points. Woven wisely, they can be magical. Woven foolishly, they can destroy you.

After spending way too long sifting through broken regular expressions, expecting that someone, somewhere had solved the obvious need for the ultimate email validation RegEx, I gave up on searching and created my own. Yes, you may have solved it too, but searching Google for ‘email validation regular expression’ gives some very poor answers. Look at all the variations on RegexLib.

I do recommend reading Regular-Expressions.info’s take on email regex. They make great points about trade-offs when using the RFC spec for emails. But I’m still not happy with their ‘practical’ email regex:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

— from regular-expressions.info

This version may work for you, but I have two issues with it:
1. It supports characters that most other websites will reject as invalid–even though they are RFC compliant (e.g. `@asdf.com)
2. It’s overly complex for what it’s checking.

I have a simpler version. It doesn’t support everything that the above regex supports (though you could add the extra characters if you like), but it does support everything I consider to be reasonable for an email address while being light, fast, and easy to understand. It also shortens collections like [a-zA-Z0-9_] by using the RegEx shortcut for word characters (\w) instead.

My Regular Expression Breakdown

Here it is:

/^[\w.%&=+$#!'-]+@[\w.-]+\.[a-zA-Z]{2,4}$/

/^ – the beginning of the string (nothing, such as a newline character or some other invalid character, may come before the email address starts)
[\w.%&=+$#!'-]+ – the username portion, which allows any word character (a-z, A-Z, 0-9, underscore) but also a few other characters. The hyphen comes last in the class so that it is read as a literal hyphen rather than as a range between two characters. You can pick and choose what you want to remove from this list, but keep in mind that some people really do create crazy emails. Remember, too, that spam bots tend to use emails like skiwytyru32hh@mail.ru, which will pass any validator (unless you are checking emails against a spam database on the back-end).
@ – gotta have this (and only 1)
[\w.-]+ – the domain or subdomain + domain (asdf.com, or sub.asdf.com)
\.[a-zA-Z]{2,4} – the top level domain (.com, .uk, .info). In an address like a@a.co.uk, the .co is matched by the domain portion above and only .uk is matched here. Longer top level domains such as .museum are rejected.
$/ – the end of the string (the address must end once it’s considered valid: no trailing spaces, line breaks or other data)
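
One thing to watch when adding or removing characters: inside a character class, a hyphen between two characters defines a range, so a literal hyphen is safest as the last character. A minimal sketch of the difference (the variable names hyphenLast and hyphenRange are only for illustration):

```javascript
// Hyphen last: it is matched as a literal hyphen.
var hyphenLast = /^[\w.%&=+$#!'-]+$/;

// Same characters, but the hyphen sits between ! and ', so it forms the
// range from ! to ' (that is: ! " # $ % & ') and no longer matches "-".
var hyphenRange = /^[\w.%&=+$#!-']+$/;

console.log(hyphenLast.test('a-b'));  // true
console.log(hyphenRange.test('a-b')); // false
console.log(hyphenRange.test('a"b')); // true: the double quote falls inside the range
console.log(hyphenLast.test('a"b'));  // false
```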

The functions below are documented using PHPDoc with valid test data and known unsupported emails. They also contain code snippet GUID identifiers, which I recommend for all code snippets (read Jeff Atwood’s “A Modest Proposal for the Copy and Paste School of Code Reuse” for more info).

JavaScript implementation (not for security!):

/**
 * validEmail checks for a single valid email
 *
 * supported RFC valid email addresses (test data):
 * a@a.com
 * A_B@A.co.uk
 * a@subdomain.adomain.com
 * abc.123@a.net
 * O'Connor@a.net
 * 12+34-5+1=42@a.org
 * me&mywife@a.co.uk
 * root!@a_b.com
 * _______@a-b.la
 * %&=+.$#!-@a.com
 *
 * Current known unsupported (but are RFC valid):
 * abc+mailbox/department=shipping@example.com
 *  !#$%&'*+-/=?^_`.{|}~@example.com (all of these characters are RFC compliant)
 * "abc@def"@example.com (anything goes inside quotation marks)
 * "Fred \"quota\" Bloggs"@example.com (however, quotes need escaping)
 *
 * @param string email The supposed email address to validate
 * @return bool valid
 * @author Adam Eivy
 * @version 2.1
 * @codesnippit bcd71ab9-dc05-45af-9855-abb57c0cf0ab
 */
function validEmail(email){
   // the hyphen is last in the username class so it is literal, not a range
   var re = /^[\w.%&=+$#!'-]+@[\w.-]+\.[a-zA-Z]{2,4}$/;
   return re.test(email);
}
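
To see how the pattern behaves against the test data in the doc comment, a small harness like the one below can help. This is only a sketch: the names emailPattern, supported, unsupported and mismatches are made up for the example, and the pattern is written with the hyphen as the last character of the username class so that it is matched literally.

```javascript
// Assumption: same pattern as validEmail, hyphen placed last in the username class.
var emailPattern = /^[\w.%&=+$#!'-]+@[\w.-]+\.[a-zA-Z]{2,4}$/;

// Addresses the doc comment lists as supported.
var supported = [
  "a@a.com", "A_B@A.co.uk", "a@subdomain.adomain.com", "abc.123@a.net",
  "O'Connor@a.net", "12+34-5+1=42@a.org", "me&mywife@a.co.uk",
  "root!@a_b.com", "_______@a-b.la", "%&=+.$#!-@a.com"
];

// RFC-valid addresses the doc comment lists as known unsupported.
var unsupported = [
  "abc+mailbox/department=shipping@example.com",
  "!#$%&'*+-/=?^_`.{|}~@example.com",
  '"abc@def"@example.com',
  '"Fred \\"quota\\" Bloggs"@example.com'
];

// Collect every address whose result differs from what the doc comment says.
var mismatches = [];
supported.forEach(function (email) {
  if (!emailPattern.test(email)) mismatches.push(email);
});
unsupported.forEach(function (email) {
  if (emailPattern.test(email)) mismatches.push(email);
});

console.log(mismatches.length === 0 ? 'all cases behave as documented' : mismatches);
```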

PHP implementation:

I use the same regular expression in PHP, but PHP also allows us to check the domain for a mailserver, which has a two-part advantage:
1. we can be sure that the domain exists
2. we can be sure that the domain has a mail server set up on it (using PHP checkdnsrr to look for an MX record)

This is where the real validation happens. Remember: You CANNOT trust client side code. This means you can never assume that your JavaScript code validated your email address. It’s a nicety for the user but not a security checkpoint. You need to check this stuff on the back end, always.

/**
 * validEmail checks for a single valid email
 *
 * supported RFC valid email addresses (test data):
 * a@a.com
 * A_B@A.co.uk
 * a@subdomain.adomain.com
 * abc.123@a.net
 * O'Connor@a.net
 * 12+34-5+1=42@a.org
 * me&mywife@a.co.uk
 * root!@a_b.com
 * _______@a-b.la
 * %&=+.$#!-@a.com
 *
 * Current known unsupported (but are RFC valid):
 * abc+mailbox/department=shipping@example.com
 *  !#$%&'*+-/=?^_`.{|}~@example.com (all of these characters are RFC compliant)
 * "abc@def"@example.com (anything goes inside quotation marks)
 * "Fred \"quota\" Bloggs"@example.com (however, quotes need escaping)
 *
 * @param string $email The supposed email address to validate
 * @param bool $validateDomain (default true): look up the domain's MX record to check for a mailserver
 * @return bool valid
 * @author Adam Eivy
 * @version 2.1
 * @codesnippit fa5a06bf-2bce-41a8-a2e0-2f6db7dd22f9
 */
function validEmail($email,$validateDomain=true){
   // hyphen is last in the username class so it is literal, not a range;
   // the D modifier stops $ from matching before a trailing newline
   if(preg_match('/^[\w.%&=+$#!\'-]+@[\w.-]+\.[a-zA-Z]{2,4}$/D' , $email)) {
      if(!$validateDomain)   return true; // not testing mailserver but regex passed
      // now test mail server on supplied domain
      list($username,$domain)=explode('@',$email,2); // explode() replaces split(), which was removed in PHP 7
      if(checkdnsrr($domain,'MX')) return true; // domain has mail record
   }
   return false; // either failed to match regex or mailserver check failed
}
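
If your back end happens to run JavaScript rather than PHP, the same two-step idea (pattern first, then a mail server lookup) can be sketched for Node.js. Everything here is an assumption for illustration: the function names hasMailServer and validEmailWithDomain are hypothetical, the pattern is written with the hyphen last in the username class, and the lookup uses resolveMx from Node's built-in dns module in place of PHP's checkdnsrr.

```javascript
const dns = require('dns').promises;

// Returns true when the domain has at least one MX record.
// resolveMx is injectable so the lookup can be stubbed out in tests;
// by default it uses Node's real DNS resolver.
async function hasMailServer(domain, resolveMx) {
  resolveMx = resolveMx || function (d) { return dns.resolveMx(d); };
  try {
    const records = await resolveMx(domain);
    return records.length > 0;
  } catch (err) {
    // lookup errors (for example ENOTFOUND or ENODATA) count as "no mail server"
    return false;
  }
}

// Pattern check first, then the MX lookup on the part after the @.
async function validEmailWithDomain(email, resolveMx) {
  const pattern = /^[\w.%&=+$#!'-]+@[\w.-]+\.[a-zA-Z]{2,4}$/;
  if (!pattern.test(email)) return false;
  const domain = email.slice(email.indexOf('@') + 1);
  return hasMailServer(domain, resolveMx);
}

// Usage: validEmailWithDomain('someone@example.com').then(console.log);
```

Keeping the lookup behind a parameter is a design choice for this sketch only; it lets the function be exercised without touching the network.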

JavaScript, jQuery, Tutorials

Using jQuery to Handle Dynamic Default Text in Form Fields

2009/09/01 antic

Download a working demo here (.zip)

It’s very common to want to add default text to a textarea or text input field and have that text vanish when a user clicks to edit the field. When the user clicks away from the field, if the field is still empty, we should bring back the default text. This is easy to do using the jQuery JavaScript Library.

The Code

HTML

We start by adding a ‘defaultText’ class to our textarea and input fields that will contain default text–and adding a title attribute with the default text for these elements:

<form name="someForm">
<input type="text" name="user" class="defaultText" value="" title="enter your name..." />
<textarea name="bio" class="defaultText" title="enter your bio..."></textarea>
</form>
<!-- don't forget to include jquery -->
<script type="text/javascript" 
src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js">
</script>

JavaScript

Now we use jQuery to find all elements with the ‘defaultText’ class and add the focus and blur handlers:

/**
 * Setup handlers on elements when the document is ready 
 */
$(document).ready( function(){
   // add the focus handler to all elements with a class of 'defaultText'
   $(".defaultText").focus( function(e){
      // if the field contains the default text, clear it
      if(this.value == $(this).attr('title')) {
         this.value = '';
      }
      $(this).removeClass('defaultText'); // remove special styling
   }).blur( function(e){ // add the blur handler
      // if the user leaves the field without entering anything, bring back the default text
      if(this.value == ""){
         $(this).addClass('defaultText'); // bring style back
         this.value = $(this).attr('title');
      }
   }).blur(); // invoke the blur handler, which copies the title to the value
});
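
The logic inside the two handlers boils down to two tiny decisions, sketched here as plain functions with no jQuery or DOM involved. The names valueOnFocus and valueOnBlur are hypothetical and not part of the demo; they only restate what the focus and blur handlers above do with the field value and the title text.

```javascript
// What the field should contain right after it gains focus:
// the default text is cleared, anything the user typed is kept.
function valueOnFocus(value, defaultText) {
  return value === defaultText ? '' : value;
}

// What the field should contain right after it loses focus:
// an empty field gets the default text back, anything else is kept.
function valueOnBlur(value, defaultText) {
  return value === '' ? defaultText : value;
}

console.log(valueOnFocus('enter your name...', 'enter your name...')); // ""
console.log(valueOnBlur('', 'enter your name...'));                    // "enter your name..."
console.log(valueOnBlur('Adam', 'enter your name...'));                // "Adam"
```

Because the default text lives in the field's value, it would be submitted along with the form if the user never touches the field. Applying something like valueOnFocus to each field in a submit handler is one possible way around that.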

CSS

Don’t forget to style your defaultText class so that it looks a little different from user supplied text:

.defaultText {
    font-size:12px;
    color:#999;
}

Intellectual Pirates

Web Development Courses, Rants, Tutorials and Hacks