Information theory (IT) is a foundational subject for computer scientists, engineers, statisticians, data miners, biologists, and cognitive scientists. Unfortunately, PHP currently lacks software tools that make it easy to explore IT concepts and apply IT methods to data analysis problems. This two-part series aims to remedy this situation by:
Introducing you to foundational information theory concepts.
Implementing these foundational concepts as classes using PHP and SQL.
Using these classes to mine web data.
This introduction will focus on forging theoretical and practical connections between information theory and database theory. An appreciation of these linkages opens up the possibility of using information theory concepts as a foundation for the design of data mining tools. We will take the first steps down that path.
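As a rough illustration of that linkage, here is a minimal sketch that assumes Shannon entropy is among the foundational concepts meant above (the excerpt does not name it). It computes the entropy of a single database column from the per-value counts that a SQL GROUP BY query would return. The function name column_entropy and the sample data are hypothetical, and the sketch is written in Python rather than PHP.

```python
import math
from collections import Counter


def column_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of a column.

    `values` stands in for the rows of one database column (hypothetical input).
    """
    # Counting occurrences mirrors: SELECT value, COUNT(*) ... GROUP BY value
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty column carries no information
    # H = sum over distinct values of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical column: four rows, two distinct values, evenly split.
print(column_entropy(["a", "b", "a", "b"]))  # 1.0
```

A column with a single repeated value yields 0 bits, and a column split evenly across k distinct values yields log2(k) bits, so the number can be read as a measure of how unpredictable the column is.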
There is a follow-up piece to this article at http://www.onlamp.com/pub/a/php/2005/03/24/joint_entropy.html. The code looks pretty solid. I may take some time and port it to Python. It might be useful for analyzing the schedule data.