On April 7th, Google launched a new version of Gmail for mobile for iPhone and Android-powered devices built on HTML5. We shared the behind-the-scenes story through this blog and decided to share more of our learnings in a brief series of follow up blog posts. In the last two posts, we covered everything you need to know in order to make effective use of AppCache. This week, we'll be having some fun and trying to disect the inner workings of AppCache by looking at the database it uses to store its resources. Before we start though, I'd like to point out one tip that can be useful while debugging AppCache related problems on the iPhone: how to clear your Application Cache.

There are two steps involved here:
  1. Go to Settings->Safari and tap "Clear Cache".
  2. Open Safari and terminate it by holding down the home screen button for 10 seconds.
The first step is pretty intuative. The problem is that the browser does not immediately notice that the cache has been cleared, and so needs to be restarted.

Now onto the fun stuff! There is no way to inspect the application cache from within the browser. However, if you are using the iPhone Simulator to develop your webapp, you can find all of the Application Cache resources and metadata in a SQLite3 database located at:

~/Library/Application Support/iPhone Simulator/User/Library/Caches/com.apple.WebAppCache/ApplicationCache.db

Let's have a look at what is contained within this database.

$ sqlite3 ApplicationCache.db
SQLite version 3.4.0
Enter ".help" for instructions
sqlite> .mode column
sqlite> .headers on
sqlite> .tables
CacheEntries CacheResourceData CacheWhitelistURLs FallbackURLs
CacheGroups CacheResources Caches SchemaVersion
sqlite> select * from CacheGroups;
id manifestHostHash manifestURL newestCache
---------- ---------------- ------------------------------------------------- -----------
1 906983082 http://mail.google.com/mail/s/?v=ma&name=sm 1

The CacheGroups table holds an overview of what manifests exist. A manifest is identified by its URL and the manifestHostHash is used to track when the contents of a manifest changes.

sqlite> select * from Caches;
id cacheGroup
---------- ----------
1 1

Here you can see that I have only one cache in my database. If the Application Cache was in the middle of updating the cache, there would be a second entry listed here for the cache currently being downloaded.

sqlite> select * from CacheEntries limit 1;
cache type resource
---------- ---------- ----------
1 2 1

The CacheEntries table holds a one-to-many relationship between a cache and resources within it.

sqlite> select * from CacheResources where id=1;
id url statusCode responseURL
---------- ------------------------------------------- ---------- -----------------------
1 http://mail.google.com/mail/s/?v=ma&name=sm 200 http://mail.google.c...

mimeType textEncodingName headers data
------------------- ---------------- --------------
text/cache-manifest utf-8

sqlite> .schema CacheResourceData
CREATE TABLE CacheResourceData (id INTEGER PRIMARY KEY AUTOINCREMENT, data BLOB);

This shows what information is stored for each resources listed in a manifest. As you can see the CacheResources and CacheResourceData tables contain everything that is needed in order to simulate a network response to a request for a resource. Let's see exactly what is stored in the database for Mobile Gmail's database.

sqlite> select type,url,mimeType,statusCode from CacheEntries,CacheResources where resource=id;
type url mimeType statusCode
---------- ---------------------------------------------- ------------------- ----------
2 http://mail.google.com/mail/s/?v=ma&name=sm text/cache-manifest 200
4 http://mail.google.com/mail/images/xls.gif image/gif 200
4 http://mail.google.com/mail/images/pdf.gif image/gif 200
4 http://mail.google.com/mail/images/ppt.gif image/gif 200
4 http://mail.google.com/mail/images/sound.gif image/gif 200
4 http://mail.google.com/mail/images/doc.gif image/gif 200
4 http://mail.google.com/mail/images/graphic.gif image/gif 200
1 http://mail.google.com/mail/s text/html 200
4 http://mail.google.com/mail/images/generic.gif image/gif 200
4 http://mail.google.com/mail/images/zip.gif image/gif 200
4 http://mail.google.com/mail/images/html2.gif image/gif 200
4 http://mail.google.com/mail/images/txt.gif image/gif 200

From this list, it is fairly easy to see what the meaning of the type field is. A host page has type 1, a manifest has type 2, and a normal resource has type 4. In order to know whether or not to load a page using AppCache, the browser checks this list to see if there is a URL of type 1 within it.

This concudes our three-part series on HTML5's Application Cache. Stay tuned for the next post where we will explore other areas of how we use HTML5 in Gmail. And just another reminder that we'll be at Google I/O, May 27-28 in San Francisco presenting a session on how we use HTML5. We'll also be available at the Developer Sandbox, looking forward to meeting you in person.

References
The HTML5 working draft: http://dev.w3.org/html5/spec/Overview.html

Apple's MobileSafari documentation: http://developer.apple.com/webapps/docs/documentation/AppleApplications/Reference/SafariJSRef/DOMApplicationCache/DOMApplicationCache.html

Webkit Source Code: http://trac.webkit.org/browser/trunk/WebCore/loader/appcache