Untitled Document

Unidimensional scaling (UDS) (What is seriation ?)

This seriation method is fairly analogous to the homonymous statistical method (Hubert et Arabie 1986).
It searches for the permutation of rows and columns from D which minimizes the following criterion:

where a is a multiplicative scaling constant and the permutation.

In PermutMatrix, we propose two heuristics, one combinatory and one numerical.

The first one is greedy and optimizes the criterion by iteratively maximizing the gradient.
At each step, an inversion of two elements is sought so that the criterion will be as improved as possible.
This rough heuristic quickly yields a solution that is often close to the global optimum.
This is usually sufficient for exploratory work conducted in PermutMatrix.

However, in this algorithm, calculations at each stage are long O(n2) and the number of stage indefined.
This algorithm could be impracticable for large dataset but it can be very useful for small data set or for intra-class reorganization in a clustering tree not totally ramified (see further).

The second heuristic is numerical.
In the first step, it involves searching for the best unidimensional representation of dissimilarities via the method of Torgerson (1958).
If D is computed with usual Euclidian metrics, then this method consists of finding the first principal component of the dataset.
This is equivalent to defining a coordinate system x in such a way that the criterion

is minimum.
The approximate seriation is the enumeration of elements in the order of projections on the first principal component of the dataset.