Aglets: A good idea for spidering?

  • Jason Haines
  • Brendan Humphreys
  • Chris Johnson
  • Nick Craswell
  • Paul Thistlewaite

4th IDEA Workshop Proceedings

http://research.microsoft.com/users/nickcr/pubs/craswell_idea97.pdf

Many individuals and businesses now rely on the Web for promulgating and finding information and, in particular, rely on centralised search databases. The extent to which these databases reflect the “contents” of the Web in an accurate and timely manner is now open to considerable doubt, and in any event it is apparent that the methods used by the search engines for finding new and modified Web documents are not scaling well. To ameliorate these problems, we have been exploring a “data push” model for notification of Web changes, in which aglets (aka servlets or peerlets) distribute the indexing task, as a replacement for the current “data pull” model. Aglets are objects with a thread of control that can migrate autonomously between processors in a distributed environment. As currently proposed and implemented, aglets have very few of the properties of Persistence. However, because they inhabit a similar conceptual space, their properties and applications are of some interest to the persistence community. Aglets have unique identities, locally persistent data and methods (in the sense that the aglet can be deactivated onto disk and reactivated), and self-migration in a distributed environment, but they do not facilitate a universal name space. They appear to be modelled on a belief that the processors in the network, while needing to be sufficiently transparent to allow some cooperative processing, will remain sufficiently opaque to thwart a realisation of the Persistence Ideals.
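
The "data push" idea and the aglet properties listed above (unique identity, locally persistent state, deactivation and reactivation) can be illustrated with a small Python sketch. It is only a toy under stated assumptions: ToyAglet, CentralIndex, their methods and the host name www.example.org are all hypothetical names invented for this illustration, not the Aglets API and not the authors' implementation, and a JSON string stands in for writing the agent to disk.

```python
import json
import uuid


class ToyAglet:
    """Hypothetical stand-in for an aglet: identity plus locally persistent state."""

    def __init__(self, host):
        self.aglet_id = uuid.uuid4().hex  # unique identity, kept across reactivation
        self.host = host                  # Web server the agent currently sits on
        self.seen = {}                    # path -> last modification stamp observed

    def scan(self, documents):
        """Return the paths that are new or modified since the previous scan."""
        changed = [p for p, stamp in documents.items() if self.seen.get(p) != stamp]
        self.seen = dict(documents)
        return sorted(changed)

    def deactivate(self):
        """Serialise the agent's state (a JSON string stands in for a disk file)."""
        return json.dumps(
            {"aglet_id": self.aglet_id, "host": self.host, "seen": self.seen}
        )

    @classmethod
    def reactivate(cls, blob, host=None):
        """Restore an agent; passing a different host imitates migration."""
        state = json.loads(blob)
        agent = cls(host or state["host"])
        agent.aglet_id = state["aglet_id"]
        agent.seen = state["seen"]
        return agent


class CentralIndex:
    """Hypothetical search database that receives pushed change notices."""

    def __init__(self):
        self.pending = []

    def notify(self, host, paths):
        self.pending.extend((host, p) for p in paths)


index = CentralIndex()
agent = ToyAglet("www.example.org")
original_id = agent.aglet_id

docs = {"/a.html": 1, "/b.html": 1}
index.notify(agent.host, agent.scan(docs))       # first scan: both pages are new

agent = ToyAglet.reactivate(agent.deactivate())  # state survives the round trip
docs["/b.html"] = 2                              # one page is modified
index.notify(agent.host, agent.scan(docs))       # only the change is pushed
```

In this sketch the central index never crawls the server; it only receives notices from the agent on the server, which is the sense of "push" assumed here. Real aglets also carry their own thread of control and move themselves between hosts, which the toy does not attempt to model.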