Speaker :
 Professor Fushing Hsieh (University of California, Davis)
Title :

1. Computing ultrametric topology for data cloud geometry: Why are hierarchical clustering trees misleading? 
Time :
 2012-07-18 (Wed) 10:00 - 12:00
Place : 
Seminar Room 722, Institute of Mathematics (NTU Campus) 
Abstract: 
We propose a new algorithm to derive the geometric structure of a data cloud. Starting from an empirical distance measure, the algorithm constructs an ultrametric space on the data and derives an ultrametric tree on this space. It proceeds as follows. The empirical relational measure is transformed into a temperature-regulated potential defined on the links between the nodes. Based on this potential, we extract at very low temperature a collection of motifs, which become building blocks for growing clusters via data-driven merging dynamics as the temperature is slowly raised. A series of phase transitions in these merging dynamics is identified at a series of critical temperatures. These temperatures are then taken as energy barrier heights to define an ultrametric topology on the data cloud. This topology provides measurable and natural distances between clusters. We call the hierarchy of clustering configurations a data cloud geometry (DCG), which can be represented by an ultrametric tree or a Parisi matrix. We have compared the trees generated with this new algorithm to equivalent trees derived with the Hierarchical Clustering (HC) method on simulated as well as real data clouds from fMRI brain connectivity studies, cancer genomics, and giraffe social network analyses. In each case, we have shown that the DCG trees are more robust and provide a better quantification of the multi-scale geometric structures of the data.
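As a rough illustration of what "barrier heights defining an ultrametric" can mean, the following minimal Python sketch computes a minimax-path distance on a toy data cloud: the distance between two points is the smallest "barrier" (largest single step) that must be crossed on any path joining them. This is a standard textbook construction used here only as a stand-in; it is not the DCG algorithm described in the abstract, and the function names and toy data are hypothetical.

```python
from itertools import product


def minimax_ultrametric(dist):
    """Minimax-path distance for a symmetric distance matrix (list of lists).

    u[i][j] is the minimum, over all paths from i to j, of the largest
    single step on the path. Illustrative stand-in, NOT the DCG algorithm.
    """
    n = len(dist)
    u = [row[:] for row in dist]  # copy so the input is left untouched
    # Floyd-Warshall variant: path cost is the max step, and we minimise it.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def is_ultrametric(u, tol=1e-12):
    """Check the strong triangle inequality u(i,j) <= max(u(i,k), u(k,j))."""
    n = len(u)
    return all(
        u[i][j] <= max(u[i][k], u[k][j]) + tol
        for i, j, k in product(range(n), repeat=3)
    )


# Toy data cloud (assumed for illustration): two groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
u = minimax_ultrametric(dist)

print(is_ultrametric(dist))  # the ordinary distance violates the inequality
print(is_ultrametric(u))     # the minimax-path distance satisfies it
print(u[0][2], u[0][4])      # within-group barrier vs. between-group barrier
```

In this toy example the within-group barrier is the unit spacing and the between-group barrier is the gap separating the two groups, so the resulting matrix has the block structure that an ultrametric tree encodes.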
Discussion on DCG tree’s applications.
Several applications of the DCG tree within the domain of combinatorial optimization are discussed. Examples include the Traveling Salesman Problem and high-dimensional regression problems.