MCS
Support for "knowledge condensation" of mailing list archives. (1998-2000)
Participants
|
- CSDL: Robert Brewer
- Affiliates: SUN, NSF
|
Summary
|
Electronic mailing lists are popular Internet information sources. Many mailing lists maintain an archive of every message sent to the list, often searchable by keyword. While useful, these archives are unfiltered: because they include every message, users cannot rapidly find the information they want, and searches for specific information often return pages of irrelevant or simply useless messages.
To solve the problems inherent in current mailing list archives, we
propose a process called condensation whereby one can strip out all the
extraneous, conversational aspects of the data stream leaving only the
pearls of interconnected wisdom. MCS will allow certain types of
mailing list communities to create condensed archives which are far
more useful than traditional archives for searching. Taking
domain-specific aspects of the mailing list into account can improve
searching even more.
The condensation process is performed by a human editor (assisted by a
tool), not an AI system. While this adds a certain amount of overhead
to the maintenance of the MCS-generated archive when compared to a
traditional archive, it makes the system implementation feasible.
The field site chosen for this research is the "jcvs" mailing list
for users of the jCVS program (a Java client for the Concurrent
Versions System).
A case study approach was used, in which the jcvs mailing list was
studied to determine the appropriate forms of representation and
automated mechanisms. Once implemented, these mechanisms were deployed
on the jcvs mailing list, and surveys of archive users were used to
evaluate the system.
We have condensed a 1,428-message mailing list archive to an archive
containing only 177 messages (an 88% reduction). The condensation
required only 1.5 minutes of editor effort per message. The condensed
archive was adopted by the users of the mailing list.
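
The reported figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of MCS; the function names are hypothetical, and the total-effort estimate assumes that the 1.5 minutes of editor effort applies to each message in the original archive, which the summary does not state explicitly.

```python
# Figures quoted in the summary above.
original_messages = 1428
condensed_messages = 177
minutes_per_message = 1.5  # editor effort per message, as reported


def reduction_percent(original: int, condensed: int) -> float:
    """Percentage of messages removed by condensation."""
    return 100.0 * (original - condensed) / original


def total_editor_hours(messages: int, minutes_each: float) -> float:
    """Total editor effort in hours.

    Assumption: the per-message cost is paid for every message passed
    in (here, every message in the original archive).
    """
    return messages * minutes_each / 60.0


reduction = reduction_percent(original_messages, condensed_messages)
hours = total_editor_hours(original_messages, minutes_per_message)

print(f"reduction: {reduction:.1f}%")
print(f"estimated editor effort: {hours:.1f} hours")
```

Under that assumption, condensing the whole archive would have taken on the order of a few working days of editor time.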
|
Software
|
The software is no longer available.
|
Publications
|
Available at the MCS Publications Area.
|
Status
|
Started in Spring 1998. Initial evaluation and thesis writeup were completed in March 2000.
|
Keywords
|
Knowledge condensation, mailing lists, archives
|