Principia Cybernetica Web

Adaptive hypertext network

The frequency with which users choose particular links in a hypertext network makes it possible to let the network learn from the implicit semantic knowledge of its users, and to reorganize itself in order to better fulfill their expectations. We propose a restructuring algorithm based on the following ideas and processes; a short code sketch combining the four rules follows the list:

1) Frequency: the frequency of transition from one node to another indicates the strength of the semantic relation between the two nodes. Although this does not hold for some nodes, such as the home page and indexes, it is a very plausible rule of thumb for most. Strictly, the rule applies only to links that connect nodes representing concepts: "conceptual nodes". Nodes that provide an overview of available nodes, or that bundle several concepts, cannot be regarded as conceptual. Every link between two conceptual nodes is thus assigned a frequency value that indicates the strength of the semantic relation between the two nodes or concepts it connects. This frequency is measured over a certain period of time and is then used to modify the network's structure.

2) Transitivity: consider a node A with a strong connection to a node B, i.e. many people use the link A->B, and a node C for which a strong connection B->C exists. The transitivity rule then implies that a new link A->C should be constructed: the network facilitates access to C from A by bypassing the two links that were otherwise necessary to reach it. We expect this rule to lead to what we call the dripping effect, by analogy with drops of water gliding down a window: a drop of rain starts to glide downwards and takes other drops along on its way, forming a small channel of water. From the moment a new connection between A and C exists, node C is more likely to be consulted, and so are the nodes that receive links from C. These links are thus likely to be used more often, and after some time it may turn out to be necessary to replace the links A->C and C->X with A->X, and so on. After a while, all related concepts should be connected by this simple rule.

3) Degradation of existing links: links whose frequency values indicate a weak semantic relation between two nodes or concepts should be removed from the network. This can happen to links that were constructed by the transitivity rule for the wrong reasons, or to existing links that have become obsolete. This rule can never eliminate the last link to or from a node, so that no node is ever disconnected from the web.

4) Noise: to ensure the web's "creativity", random links are constructed now and then. This has obvious advantages: it prevents the network from settling into a state it cannot get out of, and it creates unexpected but perhaps useful connections. In the worst case it does no harm, because the degradation of weak links ensures that these links disappear after a short while.
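To make the interplay of the four rules concrete, here is a minimal Python sketch. The graph representation, the STRONG and WEAK thresholds, the noise rate and the frequency update are illustrative assumptions; the rules above do not fix concrete values.

    import random
    from collections import defaultdict

    STRONG = 0.6   # assumed threshold for a "strong" semantic relation
    WEAK = 0.05    # assumed threshold below which a link degrades
    NOISE = 0.01   # assumed probability of a random "creative" link per cycle

    class AdaptiveWeb:
        def __init__(self):
            self.links = defaultdict(float)   # (a, b) -> relative use frequency
            self.out = defaultdict(set)       # a -> nodes reachable in one step

        def add_link(self, a, b, f=0.0):
            self.out[a].add(b)
            self.links[(a, b)] = max(self.links[(a, b)], f)

        def in_degree(self, b):
            return sum(1 for (_, y) in self.links if y == b)

        def record_transition(self, a, b, weight=0.1):
            # Rule 1: every observed traversal of a -> b strengthens the link.
            if b in self.out[a]:
                self.links[(a, b)] = min(1.0, self.links[(a, b)] + weight)

        def restructure(self, all_nodes):
            # all_nodes: list of node names in the web.
            # Rule 2 (transitivity): strong a -> b and b -> c suggest a -> c.
            for (a, b), f_ab in list(self.links.items()):
                if f_ab >= STRONG:
                    for c in list(self.out[b]):
                        if c != a and c not in self.out[a] and self.links[(b, c)] >= STRONG:
                            self.add_link(a, c, f=WEAK)   # new links start weak

            # Rule 3 (degradation): drop weak links, but never a node's
            # last outgoing or incoming link.
            for (a, b), f in list(self.links.items()):
                if f < WEAK and len(self.out[a]) > 1 and self.in_degree(b) > 1:
                    self.out[a].discard(b)
                    del self.links[(a, b)]

            # Rule 4 (noise): occasionally add a random link for "creativity".
            if random.random() < NOISE:
                a, b = random.sample(all_nodes, 2)
                if b not in self.out[a]:
                    self.add_link(a, b, f=WEAK)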

What do we expect from such a self-restructuring network? Firstly, we expect the network, by constantly restructuring itself, to eventually assimilate the common semantics of its users. The problem here is that the restructuring algorithm works only within the web itself; we are still figuring out how new information should be integrated. On the other hand, once a certain concept is connected to the network at any position, it will eventually be semantically integrated, provided people need the information and retrieve it often enough. Perhaps the newly developed forms software will enable us to integrate this feature.

Secondly, we expect the acquired semantic structure of the network to enhance retrieval of information from the network for human browsers as well as automated search algorithms.

Human browsers will find a network structured according to the way they themselves have used it, and will probably retrieve information faster and with greater ease. This is a presumption that can be tested. Some research has already been done on the advantage of semantically structured hypertext networks over other structures in terms of retrieval times (see "Hypertext: A Psychological Perspective", ed. C. McKnight, A. Dillon & J. Richardson, Ellis Horwood).

Automated retrieval of information in a semantically organised hypertext network could be achieved with the principle of spreading activation. One could, for example, treat the network, with its links and associated frequencies, as a connectionist network: after certain concepts are activated, the activation of each connected concept is calculated as the sum of the products of the activations of neighbouring concepts and the values of their links to that concept. Concepts or nodes with an above-threshold activation would "pop up" and be served to the searcher. This kind of search algorithm would not only provide what one is searching for, but could also turn up unsolicited, related material.
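As an illustration, a simple spreading-activation pass over the same frequency-weighted links might look as follows; the decay factor, the threshold and the example links are assumptions for the sketch, not part of the proposal.

    def spread_activation(links, seeds, steps=3, decay=0.5, threshold=0.2):
        # links: dict mapping (a, b) to link strength in [0, 1]
        # seeds: dict mapping start concepts to their initial activation
        activation = dict(seeds)
        for _ in range(steps):
            incoming = {}
            for (a, b), strength in links.items():
                if a in activation:
                    # Sum of products of neighbouring activations and link values.
                    incoming[b] = incoming.get(b, 0.0) + activation[a] * strength * decay
            for node, act in incoming.items():
                activation[node] = max(activation.get(node, 0.0), act)
        # Concepts with an above-threshold activation "pop up".
        return {n: a for n, a in activation.items() if a >= threshold}

    # Activating "hypertext" surfaces directly and indirectly related concepts.
    print(spread_activation(
        {("hypertext", "link"): 0.9, ("link", "node"): 0.8, ("hypertext", "web"): 0.4},
        seeds={"hypertext": 1.0}))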

Thirdly, and this is very hypothetical, if we can assume that the final structure of the network resembles the semantic structure of its users, we could use the structure of the network as a tool for idea identification. We could derive the semantic structure of a certain piece of knowledge or idea and compare it to the network to see where it fits in. Variations in meaning could be identified by comparing the links and nodes of an idea with similar nodes and their links in the network. This also assumes that our network covers the knowledge and concepts expressed in these ideas: one could not, for example, compare the structure of an idea from the field of agriculture with the structure of our semantically organised Principia Cybernetica network.
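One crude, purely hypothetical way to make such a comparison operational: measure the overlap between the concepts linked from a node in the idea's structure and those linked from the corresponding node in the network. The Jaccard measure used here is our own choice for the sketch, not something the proposal specifies.

    def neighbourhood_overlap(idea_links, web_links, concept):
        # Jaccard overlap of the sets of concepts linked from `concept`
        # in the idea's structure and in the network.
        idea_nb = {b for (a, b) in idea_links if a == concept}
        web_nb = {b for (a, b) in web_links if a == concept}
        if not (idea_nb or web_nb):
            return 0.0
        return len(idea_nb & web_nb) / len(idea_nb | web_nb)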

In a first stage, we plan to construct a test network linking 100-200 English nouns. Initially all links would be randomised; we would then start the network and let people use it. After a while, connections in the network should get re-routed, and the noun network should organise itself so that semantically related nouns are indeed linked. We could then investigate how the network performs in browsing and spreading-activation search tasks. If this proves successful, we might in a second stage consider porting the system to a network that consists not of single nouns but of nodes of text and bundled meanings. In a third and final stage we could perhaps implement the system on the PCP-web and see what happens there. This, of course, depends on our success in the previous experiments.
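A possible initialisation for this first-stage experiment, reusing the AdaptiveWeb sketch above; the noun list, the number of random links per noun and the initial link strength are placeholders.

    import random

    nouns = ["water", "tree", "house", "dog", "music"]  # extend to 100-200 nouns
    web = AdaptiveWeb()
    for noun in nouns:
        # Give every noun a few random outgoing links to start from.
        for target in random.sample([n for n in nouns if n != noun], 2):
            web.add_link(noun, target, f=0.3)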


Copyright © 1994 Principia Cybernetica

Author
J. Bollen & F. Heylighen

Date
Aug 2, 1994 (modified)
Mar 1, 1994 (created)
