Entropy as a measure of relevance//irrelevance

Entropy Agglomeration (EA) is the most useful algorithm you can imagine. The only reason it is not cited or used is that the established scientific paradigms cannot conceive of its meaning.

In fact, the idea is very simple:

In EA, entropy is a measure of relevance//irrelevance.

- Subsets of elements that either appear together or disappear together across the blocks have low entropy. Those elements are “relevant” to each other: they literally “lift each other up again.”

- Subsets of elements that appear only in part within the blocks, some present while others are absent, have high entropy. Those elements are “irrelevant” to each other: they literally do not “lift each other up again.”
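The two cases above can be checked on a toy example. What follows is only a minimal sketch, not the authors' software: it assumes the projection entropy of a subset S is the sum, over every block B that intersects S, of (|B ∩ S| / |S|) · log(|S| / |B ∩ S|), which is how I read the definition in the 2013 paper listed in the references. The function name `projection_entropy` and the toy blocks are made up for illustration.

```python
import math

def projection_entropy(blocks, subset):
    """Entropy of `subset` projected onto `blocks` (assumed definition).

    Each block is intersected with the subset; empty intersections are
    dropped, and every remaining part of size k contributes
    (k / n) * log(n / k), where n is the size of the subset.
    """
    subset = set(subset)
    n = len(subset)
    total = 0.0
    for block in blocks:
        k = len(subset & set(block))
        if k > 0:
            total += (k / n) * math.log(n / k)
    return total

# Hypothetical blocks, e.g. paragraphs viewed as sets of words.
blocks = [{"a", "b", "c"}, {"a", "b"}, {"c", "d"}, {"d"}]

# "a" and "b" always appear together or are absent together.
together = projection_entropy(blocks, {"a", "b"})

# "a" and "d" never share a block: each block holds only part of the subset.
mixed = projection_entropy(blocks, {"a", "d"})

print(together, mixed)
```

Under this assumed definition, a block that contains the whole subset contributes log(1) = 0, so the pair that always co-occurs gets zero entropy, while every block that holds only part of the subset adds a positive term.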

This is all visible in the results of the analysis of James Joyce’s Ulysses: https://arxiv.org/abs/1410.6830

In this setup, entropy becomes a measure of irrelevance, literally and by definition: https://en.wiktionary.org/wiki/relevant

References:

I. B. Fidaner & A. T. Cemgil (2013) “Summary Statistics for Partitionings and Feature Allocations.” In Advances in Neural Information Processing Systems (NIPS) 26. Paper: http://papers.nips.cc/paper/5093-summary-statistics-for-partitionings-and-feature-allocations (the reviews are available on the website)

I. B. Fidaner & A. T. Cemgil (2014) “Clustering Words by Projection Entropy,” accepted to NIPS 2014 Modern ML+NLP Workshop. Paper: http://arxiv.org/abs/1410.6830 Software Webpage: https://fidaner.wordpress.com/science/rebus/
