Web Tables

The Web Tables project at Google (see, e.g., Commun. ACM 54(2), 2011) "compiles a huge collection of databases by crawling the Web to find small relational databases expressed using the HTML table tag".

During summer 2012, I worked in Alon Halevy's Structured Databases group on ranking Web Tables for keyword queries, under the assumption that additional knowledge about the relationships between tables is available. For instance, different tables in a corpus may be (approximate) copies of each other, translations, older or newer versions of the same data, or subsets of a larger logical table.