The Large Scale Distribution project is centered around issues involved in
making an interconnected set of computers aid groups of users in the
performance of their work. The group is structured as an inter-related set of
projects, each of which addresses some problems of scaling, partial failure,
and the presentation of a comprehensible user model.
Objective for FY95
During the 1995 fiscal year, the Large Scale Distribution group investigated
problems concerning
techniques for reliable distributed computing;
ATM transport for object invocations;
email-enabled services;
tools for distributed video application construction; and
multi-screen X servers.
Description
The availability of large numbers of computers connected via a network opens
up the possibility of utilizing all of the computational resources of an
enterprise as a single, distributed system. However, there is little
experience with distributed systems on this scale. In particular, such systems
will be large enough that at any given time some component of the system is
almost certainly going to fail, so designers of protocols used by
computational entities within such a system must design with failure in
mind. Additionally, such systems open up the possibility of completely new
ways of using the computing resources of an enterprise, allowing new ways for
the people within the network to interact. Finally, such a system is different
in kind from the stand-alone systems that users have grown used to; the user
model of the enterprise-wide distributed system is not a simple extension of
the individual's desktop.
The various projects undertaken as part of the Large Scale Distribution
project all center around one or more of these problems. The protocols
designed and implemented in the reliable distributed computing investigation
all deal explicitly with the problems of failure in large systems. The effort
to base object invocations on ATM transport centered on investigating the
impact of new networking technologies on the development of distributed
application infrastructures. Email-enabled services allow us to study how the
computational infrastructure of an enterprise can be used to provide basic
services and cope with work-flow applications. The distributed video
application toolkit is an attempt to provide users with tools to build
applications that will utilize the network in new and different ways. Finally,
the multi-screen X server is a way of increasing the amount of screen real
estate available to the user, enabling experimentation into new user
interface styles that may help in reflecting the enterprise-wide distributed
system.
Accomplishments
Reliable Distributed Computing
In the area of reliable distributed computing, we concentrated on the design
of protocols that would guard against the problems introduced in a distributed
system by the lack of global knowledge and the possibility of partial
failure. Assuming a distributed system made up of interacting objects
specified by an Interface Definition Language, we specified interfaces for
transactions, allowing several objects to cooperate on a set of
operations in such a way that the group of operations would either all be
performed or none would be performed;
event notifications, allowing objects to register interest in the
occurrence of actions within other objects and then receive a notification
when those actions happen; and
activation, which enables conserving the overall resources of a
distributed system by allowing objects to minimize the set of resources they
use when the objects are not required yet automatically return to a fully
active state when needed.
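The all-or-nothing property of the transaction interface can be illustrated with a minimal sketch. The class and function names below (Participant, run_transaction) are hypothetical and are not the interfaces we specified; the sketch also ignores crashes and message loss, which a real protocol must handle.

```python
class Participant:
    """Hypothetical object taking part in a transaction (illustration only)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self._pending = None  # change staged by prepare(), not yet applied

    def prepare(self, delta):
        """Stage a change; vote False if it would make the value negative."""
        if self.value + delta < 0:
            return False
        self._pending = delta
        return True

    def commit(self):
        """Apply the staged change, if any."""
        if self._pending is not None:
            self.value += self._pending
            self._pending = None

    def abort(self):
        """Discard the staged change."""
        self._pending = None


def run_transaction(operations):
    """Run (participant, delta) pairs so that all apply or none do.

    Assumes each participant appears at most once in `operations`.
    Returns True if the group committed, False if it was aborted.
    """
    prepared = []
    for participant, delta in operations:
        if participant.prepare(delta):
            prepared.append(participant)
        else:
            # One "no" vote aborts everything staged so far.
            for p in prepared:
                p.abort()
            return False
    for p in prepared:
        p.commit()
    return True
```

Splitting the work into a prepare step and a commit step is what lets a single refusal cancel the whole group before any object has changed state.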
Along with the specification of the protocols, we produced sample
implementations of all of the protocols showing that such implementations were
possible and that the performance of distributed object systems using the
protocols was acceptable. The work led to the publication of two papers, and
the code and interfaces produced as part of the transaction investigation have
already been picked up by GTE Laboratories for use in their distributed
object system.
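The event notification and activation interfaces described earlier in this section can be sketched in the same spirit. Everything here is an assumption made for illustration: the names EventSource and ActivatableRef are hypothetical, the objects live in one process rather than across a network, and a deactivated object simply loses its state, whereas a real system would save and restore it.

```python
class EventSource:
    """Object in which other objects can register interest (illustration only)."""

    def __init__(self):
        self._listeners = {}  # event kind -> list of callbacks

    def register(self, kind, callback):
        """Record that `callback` wants to hear about events of `kind`."""
        self._listeners.setdefault(kind, []).append(callback)

    def fire(self, kind, payload):
        """Notify every callback registered for `kind`."""
        for callback in list(self._listeners.get(kind, [])):
            callback(kind, payload)


class ActivatableRef:
    """Reference that builds its target on first use and can release it."""

    def __init__(self, factory):
        self._factory = factory  # hypothetical: creates the real object
        self._target = None      # None while the object is inactive
        self.activations = 0

    def invoke(self, method, *args):
        """Call `method` on the target, activating it first if needed."""
        if self._target is None:
            self._target = self._factory()
            self.activations += 1
        return getattr(self._target, method)(*args)

    def deactivate(self):
        """Release the target so it stops consuming resources."""
        self._target = None
```

The caller of invoke never needs to know whether the target was active, which is the property that lets idle objects give up their resources.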
ATM transport for object invocations
The ATMObjects project built a datagram-based transport for the network
objects system. The result was a fairly general-purpose reliable message
protocol. It was not specific to RPC, and could support reliable message
passing for a variety of distributed computing purposes. The protocol was
derived from one devised by Liskov et al. that used loosely synchronized
clocks for connection management. This
protocol allows for efficient connection management without the direct
exchange of connection establishment messages.
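The general idea of using timestamps in place of a connection handshake can be sketched as follows. This is a heavily simplified illustration under our own assumptions, not the published protocol and not the ATMObjects implementation: the Receiver class, its fields, and the retention parameter are hypothetical, and clock skew between machines is ignored.

```python
class Receiver:
    """Accepts each message at most once without a connection handshake."""

    def __init__(self, retention):
        self.retention = retention   # seconds an idle sender's record is kept
        self.last_seen = {}          # sender id -> newest accepted timestamp
        self.upper = float("-inf")   # newest timestamp among forgotten records

    def expire(self, now):
        """Forget idle senders, remembering only a bound on their timestamps."""
        for sender, ts in list(self.last_seen.items()):
            if ts < now - self.retention:
                self.upper = max(self.upper, ts)
                del self.last_seen[sender]

    def accept(self, sender, ts, now):
        """Return True if the message stamped `ts` is new for `sender`."""
        self.expire(now)
        # Known sender: compare with its record. Unknown sender: compare
        # with the bound that covers every record we have discarded.
        bound = self.last_seen.get(sender, self.upper)
        if ts <= bound:
            return False  # possible duplicate or replay
        self.last_seen[sender] = ts
        return True
```

Because a forgotten sender is covered by the single bound, the receiver can discard per-sender state freely and still reject old duplicates, at the cost of occasionally rejecting a legitimate message that arrives very late.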
The current implementation runs on top of UDP. We wound up not targeting the
ATM network layer directly for a number of reasons. The first reason was the
lack of stability of the local ATM network. A second reason was that the local
ATM network became dedicated to testing and use by the Parallel Open Systems
Group for an extended period of time. A final reason was the lack of
documentation and local expertise on writing or using ATM drivers.
The longer term goal of investigating fast RPC protocols was met. The
protocol that was implemented required a minimal amount of overhead in
order to do connection management.
Email-enabled services
During the past year, one email-enabled service was designed, implemented, and
deployed, while another was designed and implemented. The service that was
deployed utilized secure hashing functions to allow authors to register a hash
of their work with the service. The service would gather these hashes and,
once a week, a secure digest of these hashes would be published in The
Boston Globe. This allowed an author to prove later that the document
existed no later than the publication date, making the system an electronic
notary service. The service was completely automated on the server side up to
(but not including) the placing of the classified ad in the Globe.
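The registration and weekly digest steps can be sketched as below. The sketch rests on assumptions that the report does not state: SHA-256 stands in for whichever secure hash the service used, the digest is taken over the sorted list of registered hashes, and the names NotaryService, document_hash, and verify are hypothetical.

```python
import hashlib


def document_hash(data: bytes) -> str:
    """Hash an author computes locally; the document itself is never sent."""
    return hashlib.sha256(data).hexdigest()


def digest_of(hashes):
    """Single digest over a batch of hashes (order-independent by sorting)."""
    combined = hashlib.sha256()
    for h in sorted(hashes):
        combined.update(h.encode("ascii"))
    return combined.hexdigest()


class NotaryService:
    """Collects registered hashes and produces one digest per period."""

    def __init__(self):
        self.pending = []

    def register(self, doc_hash):
        self.pending.append(doc_hash)

    def weekly_digest(self):
        """Return (digest to publish, batch it covers) and start a new batch."""
        batch, self.pending = self.pending, []
        return digest_of(batch), batch


def verify(data, batch, published_digest):
    """Check that `data` was in the batch behind a published digest."""
    return document_hash(data) in batch and digest_of(batch) == published_digest
```

Anyone holding the document, the archived batch, and a copy of the published digest can repeat the check without trusting the service.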
This led to the design and implementation of a complete document registration
and clearance system based on electronic mail as the transport
mechanism. Meant to replace the current paper-based system used in SML, the
system allows the registration, notarization, archiving, and clearance (both
management and legal) of documents. All communication is driven by electronic
mail, and after the initial entering of a document in the system all viewing
is accomplished through the web browser of choice for the person doing the
viewing. This system will be deployed in the east coast SML over the summer,
and in all of SML sometime in Q2 of the coming year.
Video application tools
During the year, the Video application tools project has built up a set of
video services that can be manipulated using the TCL scripting language to
build small video-based applications. Using a code base imported from MIT, the
project has teased out the video functionality into a set of components that
can be recombined quickly in a number of ways, allowing the construction of
simple video-based applications with a single page of TCL code. In the process
of doing this, a number of interesting problems were encountered concerning
introducing a multi-threaded model into the original code, which was single
threaded.
Documentation and deployment of the toolkit are planned for the coming year.
Multi-screen X server
The multi-screen X server, Xvan, developed in this group two years ago,
continued to be developed and maintained during the year. The server, which
allows up to nine physical screens to be treated by the X server as a single
bitmap, was moved to the X11R6 code base. The code has been publicly available
for over a year as contributed software to the X Consortium, and Sun customers
have developed products based on Xvan. The server is in daily use by most
members of the east coast division of SML and by a number of engineers on the
west coast. Use of the server allows experimentation into user interfaces that
require large amounts of screen real estate, which we believe will be
necessary in large distributed systems.
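To make the idea of a single bitmap spanning several screens concrete, the following sketch maps a pixel in the combined bitmap to a physical screen. It says nothing about how Xvan works internally: the grid layout, the row-major screen numbering, and the locate function are all assumptions made for illustration.

```python
def locate(x, y, screen_w, screen_h, cols, rows):
    """Map a pixel in the combined bitmap to (screen index, local x, local y).

    Assumes identical screens arranged in a cols-by-rows grid and numbered
    left to right, top to bottom.
    """
    if not (0 <= x < screen_w * cols and 0 <= y < screen_h * rows):
        raise ValueError("point lies outside the combined bitmap")
    col, local_x = divmod(x, screen_w)
    row, local_y = divmod(y, screen_h)
    return row * cols + col, local_x, local_y
```

With nine screens in a 3-by-3 grid, for example, a window can straddle screen boundaries while clients continue to address one coordinate space.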
Recent customer requests have generated considerable interest in Xvan within the
Sun consulting organization. We are now in the process of transferring Xvan to
that organization, which will take over the maintenance of the server. SunSoft
has also shown interest in moving Xvan-like functionality into the standard
Sun distribution of the X server.
References
Publications
"Events in an RPC Based Distributed System," J. Waldo, A. Wollrath,
G. Wyant, S. Kendall, Proceedings of the 1995 Winter USENIX Conference,
January, 1995.
"Simple Activation for Distributed Objects," A. Wollrath, G. Wyant,
J. Waldo, Proceedings of the 1995 USENIX Conference on Object Oriented
Programming and Systems, June, 1995.