Authors
A.R. Cohen
P.M.B. Vitanyi
Date (yyyy-mm-dd)
2015-02-20
Title
Web Similarity
Publication Year
2015
Number of pages
25
Publisher
ArXiv (v1)
Document type
Preprint
Abstract
Normalized web distance (NWD) is a similarity or normalized semantic distance based on the World Wide Web or any other large electronic database, for instance Wikipedia, and a search engine that returns reliable aggregate page counts. For sets of search terms the NWD gives a similarity on a scale from 0 (identical) to 1 (completely different). The NWD approximates the similarity according to all (upper semi)computable properties. We develop the theory and give applications. The derivation of the NWD method is based on Kolmogorov complexity.
Note
Version v2 (2020) also available on ArXiv, with title: Web Similarity in Sets of Search Terms using Database Queries
Permalink
https://hdl.handle.net/11245.1/9aa5d440-a31c-49fe-9500-ae86bdd7bf55