Summary: To solve some of the problems inherent in current mailing list archives, we propose a process called condensation, whereby one can strip out all the extraneous, conversational aspects of the data stream, leaving only the pearls of interconnected wisdom. The Mailing List Condensation System (MCS) will allow certain types of mailing list communities to create condensed archives that are far more useful for searching than traditional archives. Taking domain-specific aspects of the mailing list into account can improve searching further.

Electronic mailing lists are popular Internet information sources. Many mailing lists maintain an archive of all messages sent to the list, often searchable by keyword. While useful, these archives include every message sent to the list, which hampers users' ability to find the information they want quickly. Searches for specific information often return pages of irrelevant or simply useless messages.

The condensation process is performed by a human editor (assisted by a tool), not an AI system. While this adds a certain amount of overhead to the maintenance of the MCS-generated archive when compared to a traditional archive, it makes the system implementation feasible.

The field site chosen for this research is the “jcvs” mailing list for users of the jCVS program (a Java client for the Concurrent Versions System).

A case study approach was used, in which the jcvs mailing list was studied to determine the appropriate forms of representation and automated mechanisms. Once implemented, these mechanisms were deployed on the jcvs mailing list, and surveys of archive users were used to evaluate the system.

We have condensed a 1428-message mailing list archive to an archive containing only 177 messages (an 88% reduction). The condensation required only 1.5 minutes of editor effort per message. The condensed archive was adopted by the users of the mailing list.
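
The reduction figure can be checked with a few lines of arithmetic. The sketch below uses only the message counts and the per-message effort quoted above; the variable names are illustrative, and the total-effort estimate rests on the assumption that the 1.5 minutes applies to each message in the original archive, which the summary does not state explicitly.

```python
# Figures quoted in the summary above.
original_messages = 1428
condensed_messages = 177

# Share of the original archive removed by condensation.
removed = original_messages - condensed_messages
reduction_pct = 100 * removed / original_messages

print(f"messages removed: {removed}")
print(f"reduction: {reduction_pct:.1f}%")

# ASSUMPTION: the 1.5 minutes of editor effort is counted per message
# in the original archive (the summary leaves this unspecified).
minutes_per_message = 1.5
total_hours = original_messages * minutes_per_message / 60

print(f"estimated total editor effort: {total_hours:.1f} hours")
```

Under that assumption, the estimate gives a rough sense of the editing overhead that the human-driven condensation process adds relative to a traditional archive.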

Principal researcher(s): Robert Brewer

Status: Active development 1998–2000.