Exploring Customization of Distributed Systems using COM

 

Yi-Min Wang
Microsoft Research, Microsoft Corporation
Redmond, Washington

Pi-Yu Emerald Chung
Bell Labs Research, Lucent Technologies
Murray Hill, New Jersey

 

Abstract

Component Object Model (COM) specifies a binary standard for object interaction and supports remote object invocation. In this paper, we give a brief introduction to COM and focus on describing the features that make COM an attractive platform for research and development. These include transparency, extensibility, indirection, versioning, and server lifetime management.

  1. Introduction

    Component Object Model (COM) [COM95] specifies an architecture, a binary standard, and a supporting infrastructure for building, using, and evolving component-based applications. It extends the benefits of object-oriented programming such as encapsulation, polymorphism, and software reuse to a dynamic and cross-process setting. Distributed COM (DCOM) [Brown98] is the distributed extension of COM. It specifies the additional infrastructure that is required to further extend the benefits to networked environments.

    Distributed computing is becoming mainstream owing to advances in high-speed networking and the explosive growth of the Internet. Object-oriented programming has become a dominant paradigm for developing reusable software. Distributed objects combine the two trends and are becoming increasingly popular. More and more software systems are being built as distributed object applications, and they often share a number of common goals. The main objective of this paper is to identify the main features of COM/DCOM that greatly facilitate achieving these common goals. Such features include the separation of interfaces and implementations, support for objects with multiple interfaces, language neutrality, run-time binary software reuse, location transparency, architecture for extensibility, support for indirection, approach to versioning, and different styles of server lifetime management. The argument is that, by using COM/DCOM as a platform for building distributed object applications, researchers and developers can concentrate on important issues specific to their applications without having to devote a significant portion of their efforts to building the supporting infrastructure.

  2. COM Basics

    2.1 Object model

      The separation of interface and implementation is at the core of COM. An interface is a collection of functionally related abstract methods, and is identified by a 128-bit globally unique identifier (GUID) called the interface ID (IID). In contrast, an object class is a concrete implementation of one or more interfaces, and is also identified by a GUID called the class ID (CLSID). The use of GUIDs allows programmers to independently generate unique IDs without requiring a central authority. An object instance (or object) is an instantiation of some object class. An object server is a dynamic link library (DLL) or an executable (EXE) capable of creating object instances of potentially multiple classes. A client is a process that invokes methods of an object.

      On Windows NT, a lot of COM-related information is stored in the registry under HKEY_CLASSES_ROOT. The information can be viewed and modified from the registry editor regedt32.exe. All registered object classes can be found under the CLSID key and all registered interfaces can be found under the Interface key. For example, "Microsoft Word Document" is an object class with CLSID {00020900-0000-0000-C000-000000000046} (or {00020906-0000-0000-C000-000000000046}). The LocalServer32 registry subkey indicates that its associated implementation filename is winword.exe. This class supports multiple interfaces including an IDataObject interface with IID {0000010E-0000-0000-C000-000000000046}. IDataObject is an interface enabling data transfer and notification of changes in data. It is also supported by the object class "Microsoft PowerPoint Slide" with CLSID {EA7BAE71-FB3B-11CD-A903-00AA00510EA3} (or {64818D11-4F9B-11CF-86EA-00AA00B929E8}) and implementation file PowerPnt.exe.
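As a small illustration of the identifiers discussed above, the following Python sketch parses the IDataObject IID quoted in the text, confirms that it is 128 bits wide, and generates a fresh identifier locally without consulting any central authority. The helper names `new_guid` and `registry_form` are hypothetical, and Python's `uuid4` merely stands in for a GUID generator; it is not claimed to be the algorithm COM itself uses.

```python
import uuid

# IID of IDataObject, exactly as quoted in the text above.
IID_IDataObject = uuid.UUID("{0000010E-0000-0000-C000-000000000046}")


def new_guid() -> uuid.UUID:
    """Generate a fresh 128-bit identifier locally (stand-in generator)."""
    return uuid.uuid4()


def registry_form(guid: uuid.UUID) -> str:
    """Format an identifier in the brace-delimited, upper-case style shown above."""
    return "{" + str(guid).upper() + "}"


print(registry_form(IID_IDataObject))
print(len(IID_IDataObject.bytes) * 8, "bits")
print(registry_form(new_guid()))
```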

    2.2 Binary interface standard

      COM specifies a binary standard for interfaces to ensure dynamic interoperability of binary objects possibly built using different programming languages. Specifically, any COM interface must satisfy two requirements. First, its instantiation must follow a standard memory layout, which is the same as the C++ virtual function table [Box98]. In other words, a COM interface pointer is a pointer to a pointer that points to an array of virtual function pointers. Second, any COM interface must inherit from the IUnknown interface so that its first three methods are (1) QueryInterface() for navigating between interfaces of the same object instance, (2) AddRef() for incrementing reference counts, and (3) Release() for decrementing reference counts.

    2.3 Programming model

      A typical client/server interaction in COM proceeds as follows. The client starts the activation phase by calling CoCreateInstance() with the CLSID of the requested object and the IID of the requested interface. It gets back an interface pointer from the call. Upon returning the interface pointer, the object calls AddRef() on itself. In the method invocation phase, the client invokes methods of the interface through the pointer as if the object resided in its own address space. When the client needs to call methods of another interface of the same object, it calls QueryInterface() on the current interface and specifies the IID of the second interface. Once it gets back a pointer to the second interface, it can invoke methods as usual. When the client finishes using either interface pointer, it calls Release() on the pointer.
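The interaction just described can be modeled in a few lines. The sketch below is a language-neutral model written in Python, not real COM: the class names, the string interface IDs, and the `co_create_instance` helper are all hypothetical stand-ins for the CLSID/IID machinery, and the vtable layout of the binary standard is deliberately not modeled. It shows only the sequence of activation, method invocation, QueryInterface(), and Release() with its effect on the reference count.

```python
# Hypothetical interface IDs; real IIDs are 128-bit GUIDs.
IID_IREADER, IID_IWRITER = "IReader", "IWriter"
METHODS = {IID_IREADER: {"read"}, IID_IWRITER: {"write"}}


class Document:
    """Toy object class implementing two interfaces (a model, not real COM)."""

    def __init__(self):
        self.refs = 0
        self.text = ""
        self.alive = True

    def read(self):
        return self.text

    def write(self, s):
        self.text += s


class InterfacePointer:
    """Models an interface pointer: one interface of one object instance."""

    def __init__(self, obj, iid):
        if iid not in METHODS:
            raise LookupError("no such interface: " + iid)
        self.obj, self.iid = obj, iid
        obj.refs += 1                      # models AddRef() on the returned pointer

    def query_interface(self, iid):
        """Navigate to another interface of the same object instance."""
        return InterfacePointer(self.obj, iid)

    def invoke(self, method, *args):
        if method not in METHODS[self.iid]:
            raise AttributeError(method + " is not part of " + self.iid)
        return getattr(self.obj, method)(*args)

    def release(self):
        self.obj.refs -= 1
        if self.obj.refs == 0:
            self.obj.alive = False         # the object may now destroy itself
        return self.obj.refs


def co_create_instance(cls, iid):
    """Stand-in for the activation phase: create an object, return a pointer."""
    return InterfacePointer(cls(), iid)


writer = co_create_instance(Document, IID_IWRITER)   # activation phase
writer.invoke("write", "hello")                      # method invocation phase
reader = writer.query_interface(IID_IREADER)         # second interface, same object
print(reader.invoke("read"), reader.obj.refs)        # prints: hello 2
writer.release()
reader.release()
print(reader.obj.alive)                              # prints: False
```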

    2.4 Remoting architecture

      We use the term remoting architecture [Chung97] to refer to the entire infrastructure that connects COM clients to out-of-process server objects. (See Figure 1.) The standard remoting architecture includes, among other things, (1) object proxies that act as the client-side representatives of server objects and connect directly to the client; (2) interface proxies that perform client-side data marshaling and are aggregated into object proxies; (3) client-side channel objects that use remote procedure calls (RPCs) to forward marshaled calls; (4) server-side endpoints that receive RPC requests; (5) a server-side stub manager that dispatches calls to appropriate interface stubs; (6) interface stubs that perform server-side data marshaling and make actual calls on the objects; and (7) a standard marshaler that marshals interface pointers into object references on the server side and unmarshals the object references on the client side. Note that interface proxies and stubs are application-specific and are generated by running an Interface Definition Language (IDL) compiler on application-supplied IDL files. The other objects are application-independent and are provided by COM.

      Figure 1. COM remoting architecture and extensibility.
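To make the division of labor among proxy, channel, and stub concrete, here is a minimal in-process model in Python. It is a sketch under stated assumptions: JSON stands in for the real marshaling format, the channel is a direct function call rather than an RPC, and the `Calculator` class and all other names are hypothetical. Object proxies, endpoints, and the stub manager are omitted.

```python
import json


class Calculator:
    """Hypothetical server object."""

    def add(self, a, b):
        return a + b


class InterfaceStub:
    """Server side: unmarshals the request and makes the actual call."""

    def __init__(self, obj):
        self.obj = obj

    def dispatch(self, packet: bytes) -> bytes:
        request = json.loads(packet)
        result = getattr(self.obj, request["method"])(*request["args"])
        return json.dumps({"result": result}).encode()


class Channel:
    """Stands in for the RPC channel; here it is a direct in-process hand-off."""

    def __init__(self, stub):
        self.stub = stub
        self.packets_sent = 0

    def send(self, packet: bytes) -> bytes:
        self.packets_sent += 1
        return self.stub.dispatch(packet)


class InterfaceProxy:
    """Client side: marshals each call into a packet and forwards it."""

    def __init__(self, channel):
        self.channel = channel

    def add(self, a, b):
        packet = json.dumps({"method": "add", "args": [a, b]}).encode()
        reply = self.channel.send(packet)
        return json.loads(reply)["result"]


channel = Channel(InterfaceStub(Calculator()))
proxy = InterfaceProxy(channel)
print(proxy.add(2, 3))   # the client calls the proxy as if it were the object
```

Only the proxy and the stub know the method signature, which mirrors the observation above that these two pieces are application-specific while the channel is not.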

    2.5 DCOM

      The DCOM wire protocol extends the remoting architecture across different machines. Currently, it is specified as a set of extensions layered on top of the DCE RPC specification [DCE95]. It adopts DCE RPC’s Network Data Representation (NDR) format for marshaling data to be transmitted across heterogeneous networks. It also leverages DCE RPC’s security capabilities for authentication, authorization, and message integrity. In addition, DCOM specifies the RPC interfaces for remote server activation, ID-to-endpoint resolution, remote IUnknown method invocation, and pinging for robust reference counting [Brown98]. It also defines the data structure of object references and the DCOM-specific portion of RPC packets.

    2.6 Threading model

    If an application allows multiple clients to concurrently invoke methods of the same COM object, some synchronization mechanism needs to be provided to protect the data. COM introduces the concept of apartments to allow objects with different concurrency constraints to live in the same process. An apartment is a logical grouping of objects that share the same concurrency constraints. Before a thread can use COM, it must first enter an apartment by calling CoInitializeEx(). Every COM process can have at most one multithreaded apartment (MTA), but it can contain multiple single-threaded apartments (STAs). Multiple threads can execute in an MTA concurrently, so object data in an MTA need to be properly protected. In contrast, only one thread can execute in an STA, and so concurrent accesses to objects in an STA are automatically serialized.
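The serialization property of an STA can be illustrated with a toy model in Python. This is only a sketch of the idea: the `SingleThreadedApartment` class is hypothetical, it uses a plain queue where COM uses its own call-delivery mechanism, and error handling is omitted. Four client threads call an object that has no locking of its own; because every call is executed by the apartment's single thread, no update is lost.

```python
import queue
import threading


class SingleThreadedApartment:
    """Toy STA: one thread executes every call made to objects living in it."""

    def __init__(self):
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._pump, daemon=True)
        self._thread.start()

    def _pump(self):
        # Drain the queue forever, running one call at a time.
        while True:
            func, args, done, box = self._calls.get()
            box.append(func(*args))
            done.set()

    def call(self, func, *args):
        """Post a call to the apartment's thread and wait for its result."""
        done, box = threading.Event(), []
        self._calls.put((func, args, done, box))
        done.wait()
        return box[0]


class Counter:
    """Object with no locking of its own; it relies on the STA for safety."""

    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value          # unprotected read-modify-write
        self.value = current + 1
        return threading.current_thread()


sta = SingleThreadedApartment()
counter = Counter()


def client():
    for _ in range(1000):
        sta.call(counter.increment)


clients = [threading.Thread(target=client) for _ in range(4)]
for t in clients:
    t.start()
for t in clients:
    t.join()
print(counter.value)   # prints: 4000
```

An object placed in an MTA would instead have to guard `increment` itself, for example with a lock.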

  3. Main Features

    3.1 Transparency

      In the activation phase, COM supports both non-transparent and transparent modes. In the non-transparent mode, a client can explicitly specify whether the server component resides in a DLL or an EXE, and the remote machine name if the component is to be run remotely. Alternatively, the client can select the transparent mode and let COM consult the registry to determine such attributes. Once the machine name is determined, COM will try to attach to a running server instance, hosting the requested object class, on that machine. If none exists, COM will automatically locate the server implementation file, start a server process, and create an object instance.

      To provide call transparency in the method invocation phase, the server returns an object reference, instead of a physical connection, as the result of an activation request. An object reference encapsulates all the necessary connection information for the client to reach the server object. It typically includes machine IP address, port number, and object ID. Although a server usually returns an object reference representing an object instance that it is hosting, it can also return an object reference that it has obtained from another machine. When the object reference is shipped to the client side, the client-side COM infrastructure unmarshals it by extracting the connection information, making the connection, and returning an interface pointer to the client. When the client makes a call through the pointer, the call will be transparently routed to the object identified by the object reference, without passing through the server that was initially contacted.
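The following Python sketch models the referral behavior described above. It is an illustration only: the `NETWORK` dictionary stands in for the real network, and the addresses, port numbers, object ID, and all class and function names are hypothetical. A front-end machine answers the activation request with an object reference it obtained from a back-end machine, and subsequent calls reach the back end directly.

```python
from dataclasses import dataclass

NETWORK = {}   # (host, port) -> Machine; stands in for the real network


@dataclass(frozen=True)
class ObjectReference:
    """Connection information a client needs to reach a server object."""
    host: str
    port: int
    object_id: str


class Machine:
    """Toy host that exports callable objects by object ID."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self.objects = {}
        self.calls_handled = 0
        NETWORK[(host, port)] = self

    def export(self, object_id, obj):
        self.objects[object_id] = obj
        return ObjectReference(self.host, self.port, object_id)

    def handle(self, object_id, *args):
        self.calls_handled += 1
        return self.objects[object_id](*args)


def unmarshal(ref):
    """Client side: extract the connection information and return a callable."""
    machine = NETWORK[(ref.host, ref.port)]
    return lambda *args: machine.handle(ref.object_id, *args)


backend = Machine("10.0.0.2", 5001)
frontend = Machine("10.0.0.1", 5000)
backend_ref = backend.export("greeter-1", lambda name: "hello " + name)


def activate():
    """The front end answers activation with a reference obtained elsewhere."""
    frontend.calls_handled += 1
    return backend_ref


greet = unmarshal(activate())
# The method call goes straight to the back end; the front end sees only activation.
print(greet("world"), frontend.calls_handled, backend.calls_handled)
```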

      A good example of exploiting COM’s location transparency is the auto-distribution work in the Coign project [Hunt97]. Programmers write distributed component-based applications on a single machine without having to worry about how to deploy the components on their networks. Coign then instruments the binary code to perform intercommunication analysis in scenario-based profiling. Based on the analysis results, Coign can automatically partition and distribute the components to nodes on the given network to achieve efficient execution.

    3.2 Extensibility

      Usually, COM applications rely on COM to provide standard marshaling and transport. However, some applications may need to customize the client-server connection for a number of reasons. For example, a client process may wish to cache read-only data to speed up access. An application may need to run RPC on a new transport, or may require multicast or asynchronous transport that does not fit the RPC paradigm. Some applications need to integrate DCOM with third-party compression or encryption packages. Distributed shared object systems may wish to hide data consistency logic in the proxies. All these applications demand extensibility in the remoting architecture.

      Extensibility provided by COM can be divided into three categories: below, above, and within. The first category extends COM at the RPC layer and below, as shown in Figure 1. The main advantage is that it is totally transparent to the standard remoting architecture. A disadvantage is that it is only applicable to transport replacement applications. Currently, DCOM can be configured to run on TCP, UDP, IPX, or NetBIOS by simply modifying the registry key HKEY_LOCAL_MACHINE\Software\Microsoft\Rpc. If a new transport, such as fast user-level networking, supports a compatible set of APIs, DCOM applications can be configured to run on the new transport without any changes.

      To achieve the other two types of extensibility, COM supports a custom marshaling mechanism. By implementing an IMarshal interface, a server object indicates that it wants to bypass the standard remoting architecture and supply its own custom connection. By implementing the method calls of the IMarshal interface, the object has the flexibility of creating any number of objects and connecting them in any way to serve as the custom remoting architecture. In particular, the object can construct a custom object reference and specify the CLSID of the client-side custom unmarshaler, which will be automatically instantiated by COM to receive and interpret the custom object reference and to make the custom connection.

      The second type of extensibility allows inserting a handler layer above the standard remoting architecture and below the user application. It is often called semi-custom marshaling because most of the tasks are eventually delegated to the standard remoting architecture, as shown in Figure 1. Specifically, the server object’s IMarshal implementation delegates the task of marshaling to the standard marshaler, and performs additional processing on top of that. Similarly, the custom unmarshaler’s IMarshal implementation also delegates the task of unmarshaling to the standard marshaler and builds additional logic on top of that. As part of the marshaling/unmarshaling process, a custom proxy and a custom stub are inserted to allow additional processing of each method invocation.
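The handler idea can be sketched as a layer that wraps the standard path and adds logic on top of it. The Python model below is a hedged illustration, not the IMarshal mechanism itself: `StandardProxy` stands in for the whole standard remoting path, and the `Catalog` object, the `READ_ONLY` set, and the caching policy are assumptions chosen to match the read-only caching example mentioned earlier in this section.

```python
class Catalog:
    """Hypothetical server object with one read-only and one mutating method."""

    def __init__(self):
        self.prices = {"apple": 3}

    def get_price(self, item):
        return self.prices[item]

    def set_price(self, item, price):
        self.prices[item] = price


class StandardProxy:
    """Stands in for the standard remoting path; counts round trips."""

    def __init__(self, server_obj):
        self._server = server_obj
        self.round_trips = 0

    def invoke(self, method, *args):
        self.round_trips += 1
        return getattr(self._server, method)(*args)


class CachingHandler:
    """Custom proxy layered above the standard proxy and below the client."""

    READ_ONLY = {"get_price"}

    def __init__(self, standard_proxy):
        self._next = standard_proxy
        self._cache = {}

    def invoke(self, method, *args):
        if method not in self.READ_ONLY:
            self._cache.clear()            # a write may invalidate cached reads
            return self._next.invoke(method, *args)
        key = (method, args)
        if key not in self._cache:
            self._cache[key] = self._next.invoke(method, *args)
        return self._cache[key]


standard = StandardProxy(Catalog())
handler = CachingHandler(standard)
handler.invoke("get_price", "apple")
handler.invoke("get_price", "apple")       # served from the client-side cache
print(standard.round_trips)                # prints: 1
```

Note that this cache stays coherent only for writes that pass through the same handler; a real design would need a consistency protocol for writes made by other clients.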

      The third type of extensibility is the most general one. Very often, applications want to supply a few custom objects while reusing most of the standard marshaling objects. For example, algorithms that manipulate marshaled data streams instead of individual call parameters may want to replace channel-level objects and reuse marshaling objects. This is hard to accomplish in the current COM architecture. A new componentized architecture called COMERA (COM Extensible Remoting Architecture) has been proposed to promote binary software reuse at the infrastructure level [Wang98].

    3.3 Indirection

      Many software problems can be solved by one more level of indirection. Supporting indirection is a special form of providing extensibility. In most traditional programming paradigms, offering one more level of indirection often involves tricky programming hacks that may impose certain limitations. In contrast, COM builds into its architecture the support for indirection. As demonstrated in the following discussion, activation indirection can be used for on-line software update and load balancing, while call indirection can facilitate fault tolerance and object migration.

      The first indirection occurs when a client requests the activation of a server object by specifying its CLSID. Since the mapping from a CLSID to the server executable filename is determined by registry keys such as TreatAs and LocalServer32, the same client can actually be invoking two different implementations with the same CLSID across two activations if a key value is changed. This provides the basis for on-line software update. Another indirection exists after the server receives the activation request and before it returns an interface pointer. As pointed out earlier, the server can potentially return an object reference that it has obtained from another machine. This provides the basis for load balancing. (See the broker example [Nelson97].)
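The first indirection point can be modeled as a lookup table that the client never sees. In the Python sketch below, a dictionary stands in for the registry, and the CLSID strings and file names are hypothetical placeholders. Only the key names TreatAs and LocalServer32 come from the text, and the lookup rule is a simplification of what COM actually does. Changing one entry redirects the next activation of the same CLSID to a different implementation.

```python
# The dictionary stands in for the registry; CLSIDs and file names are hypothetical.
registry = {
    "CLSID_Speller":   {"LocalServer32": "speller_v1.exe"},
    "CLSID_SpellerV2": {"LocalServer32": "speller_v2.exe"},
}


def resolve(clsid, registry):
    """Follow TreatAs links, then return the server file for the final CLSID."""
    seen = set()
    while "TreatAs" in registry[clsid]:
        if clsid in seen:
            raise ValueError("TreatAs cycle at " + clsid)
        seen.add(clsid)
        clsid = registry[clsid]["TreatAs"]
    return registry[clsid]["LocalServer32"]


before = resolve("CLSID_Speller", registry)
registry["CLSID_Speller"]["TreatAs"] = "CLSID_SpellerV2"   # on-line update
after = resolve("CLSID_Speller", registry)
print(before, after)   # prints: speller_v1.exe speller_v2.exe
```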

      As part of the (interface pointer) marshaling and unmarshaling process, interface stubs and proxies responsible for data marshaling are dynamically loaded and created. Since these stubs and proxies are themselves COM objects, this process provides yet another indirection point. By controlling through the registry (1) the mapping from the interface IID to the CLSID of these proxy/stub objects, or (2) the mapping from such CLSID to the implementation filename, applications can choose to either reuse the standard interface proxies and stubs or supply their own custom ones.

      In standard marshaling, once an interface pointer is returned to the client, the client is bound to the server object, and so it is in general harder to provide indirection for method calls. A limited form of indirection is provided by the RPC layer: if the server’s IP address is failed over to another machine and the original connection is broken, the RPC layer will retry the connection and get redirected to the new machine. The next-generation COM+ runtime and services [Kirtland97][COM+98] will provide a general mechanism called interceptors, which support indirection through receiving events related to object creation as well as method invocation.

      Custom marshaling provides the ultimate form of method call indirection. Basically, the entire remoting architecture can be considered as a built-in architecture for indirection. At the higher layer, custom proxies can perform client-side data-dependent routing by examining input parameters. At the lower layer, custom channels can dynamically decide on which physical connection to send a message through. This serves as the basis for client-transparent object migration and fault tolerance. The above approach supplies indirection logic inside the proxy and channel objects. Alternatively, since these objects are themselves COM objects that are dynamically created during the unmarshaling process, indirection can also be supported by either specifying different CLSIDs or changing the CLSID to filename mapping.

    3.4 Versioning

      COM’s approach to versioning is based on the following three requirements: first, any interface (identified by an IID) must be immutable. Second, a new implementation of the same CLSID must support existing interfaces. Finally, any client must start interacting with a server by querying an interface with an IID. Such a combination allows independent evolution of client and server software.

      Suppose, on a particular machine, the server software is upgraded before the client is. Since the new server supports all old interfaces, the old client can still obtain all the interface pointers that it needs and run smoothly. When the client software is also upgraded, the new client will query the new interfaces to enjoy the new features. In contrast, suppose the client software is upgraded first on another machine. The new client will try querying the new interfaces on the old server and fail. This procedure forces the new client to handle the failure by, for example, providing only old features. But it will not cause the new client to crash or unknowingly execute incorrectly.
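The four client/server combinations discussed above can be checked with a small model. In this Python sketch the interface names `ISpell` and `ISpell2` and the two server classes are hypothetical, and returning `None` models a failed interface query. The point is that the new client degrades gracefully on the old server instead of failing.

```python
class OldServer:
    INTERFACES = {"ISpell"}


class NewServer:
    # A new implementation of the same class keeps supporting the old interface.
    INTERFACES = {"ISpell", "ISpell2"}


def query_interface(server, iid):
    """Return the IID if the server supports it, else None (a failed query)."""
    return iid if iid in server.INTERFACES else None


def old_client(server):
    if query_interface(server, "ISpell"):
        return "old features"
    raise RuntimeError("no usable interface")


def new_client(server):
    """Prefer the new interface; fall back to the old feature set."""
    if query_interface(server, "ISpell2"):
        return "new features"
    if query_interface(server, "ISpell"):
        return "old features"
    raise RuntimeError("no usable interface")


for client in (old_client, new_client):
    for server in (OldServer(), NewServer()):
        print(client.__name__, type(server).__name__, client(server))
```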

      Admittedly, there are still problems in practice that remain to be solved. For example, bug fixes of an existing interface implementation may change the behavior; new implementations of the same CLSID may not be willing to carry all old implementations.

    3.5 Server lifetime management

    COM supports a rich set of server styles that require different server lifetime management strategies. In the basic style, an activated server creates class factories for all supported CLSIDs. Object instances are created when clients make such requests. Reference counting is used to manage server lifetime: the server increments the count upon returning an interface pointer. Each client must follow a set of rules to increment the count (when duplicating an interface pointer, for example) and to decrement the count when it finishes using a pointer. When the count drops to zero, the server object realizes that it is no longer serving any client and can therefore be destroyed. To solve the problem of an abnormally terminated client holding a reference indefinitely, COM provides an automatic pinging mechanism [Brown98]: the client-side ping code starts sending periodic heartbeat messages to the server after an object reference is unmarshaled. It stops sending the heartbeats when the client process terminates. Upon detecting that the number of missing heartbeats has exceeded a threshold, the server-side ping code declares that the client has died and so decrements the server object’s reference count on the client’s behalf.
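The server-side bookkeeping behind pinging can be sketched as follows. This Python model is an assumption-laden simplification: the `PingTracker` class is hypothetical, each client holds exactly one reference, time is passed in explicitly rather than read from a clock, and the period and threshold values are illustrative, not the values DCOM uses.

```python
class PingTracker:
    """Server-side bookkeeping for one object; one reference per client."""

    def __init__(self, ping_period, max_missed):
        self.ping_period = ping_period
        self.max_missed = max_missed
        self.ref_count = 0
        self.last_ping = {}            # client id -> time of last heartbeat

    def add_client(self, client, now):
        self.ref_count += 1
        self.last_ping[client] = now

    def ping(self, client, now):
        if client in self.last_ping:
            self.last_ping[client] = now

    def release(self, client):
        if self.last_ping.pop(client, None) is not None:
            self.ref_count -= 1

    def expire(self, now):
        """Release references held by clients that missed too many pings."""
        deadline = self.ping_period * self.max_missed
        dead = [c for c, t in self.last_ping.items() if now - t > deadline]
        for client in dead:
            self.release(client)       # decrement on the dead client's behalf
        return dead


tracker = PingTracker(ping_period=2, max_missed=3)
tracker.add_client("A", now=0)
tracker.add_client("B", now=0)
for t in (2, 4, 6, 8):
    tracker.ping("A", now=t)           # B has crashed and sends nothing
print(tracker.expire(now=8), tracker.ref_count)   # prints: ['B'] 1
```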

    Long-running, singleton objects are another style of implementing COM servers. A server process creates a set of object instances upon starting and destroys them upon exiting. Since there is no class factory, clients cannot use the usual CoCreateInstance() call. Instead, they use CoGetClassObject() to connect to these shared server objects directly. Since the server is long-running and the objects’ lifetime is determined by the server process’ lifetime, reference counting and pinging are turned off.

    Microsoft Transaction Server (MTS) [MTS98] provides yet another style of server programming. MTS server objects must be implemented in the form of DLLs that are to be hosted by MTS surrogate processes. MTS provides context objects for these server objects so that they can participate in transactions. But MTS is more than transactions: it also provides efficient resource management. In the MTS style, a server object instance notifies MTS when it has finished processing for the current task and no longer needs the state associated with the instance. MTS can then reuse that instance and its associated resources to serve other clients without incurring object activation and resource initialization overhead.
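The reuse pattern described in this paragraph can be illustrated with a simple instance pool. The Python sketch below is a model of the idea only, not of the MTS programming interface: `InstancePool`, `Worker`, and the `task_done` method are hypothetical names, and the `connection` attribute stands in for any resource that is expensive to initialize.

```python
class Worker:
    """Hypothetical server object holding an expensive resource."""

    def __init__(self):
        self.connection = object()     # stands in for a costly resource
        self.state = None

    def handle(self, request):
        self.state = request           # per-task state
        return "done:" + request

    def reset(self):
        self.state = None              # discard state, keep the resource


class InstancePool:
    """Hands out idle instances and creates new ones only when none is idle."""

    def __init__(self, factory):
        self._factory = factory
        self._idle = []
        self.created = 0

    def acquire(self):
        if self._idle:
            return self._idle.pop()
        self.created += 1
        return self._factory()

    def task_done(self, instance):
        """The instance signals that its per-task state is no longer needed."""
        instance.reset()
        self._idle.append(instance)


pool = InstancePool(Worker)
for request in ("a", "b", "c"):
    worker = pool.acquire()
    worker.handle(request)
    pool.task_done(worker)
print(pool.created)   # prints: 1
```

Three sequential requests are served by a single instance, so the activation and resource initialization cost is paid once.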

  4. Summary

    We have described the main features of COM and DCOM. The basic model cleanly separates interfaces from implementations and supports the notion of objects with multiple interfaces. The architecture provides natural support for extensibility, indirection, and versioning. The binary standard promotes language neutrality and binary software reuse at both the application and the infrastructure layers. The infrastructure support provides location transparency and extends the benefits of object-oriented programming to networked environments. As COM gets ported to all major versions of Unix and mainframe [SAG], as COM+ [COM+98] dramatically simplifies the process of building COM applications, and as Microsoft Transaction Server evolves into a comprehensive execution management environment, COM/DCOM will become an extremely powerful R&D platform for building distributed and component-based applications.

  5. Acknowledgements

    We thank Galen Hunt and Mark Ryland (Microsoft) for their useful discussions.

  6. References

  1. [Box98] D. Box, Essential COM, Addison-Wesley, 1998.
  2. [Brown98] N. Brown and C. Kindel, Distributed Component Object Model Protocol -- DCOM/1.0, http://www.microsoft.com/com/, 1998.
  3. [Chung97] P. E. Chung, Y. Huang, S. Yajnik, D. Liang, J. C. Shih, C. Y. Wang, and Y. M. Wang, "DCOM and CORBA Side by Side, Step by Step, and Layer by Layer," in C++ Report, Vol. 10, No. 1, pp. 18-29,40, Jan. 1998. (http://www.research.microsoft.com/os/ymwang/papers/C++R97CR.htm)
  4. [COM95] The Component Object Model Specification, http://www.microsoft.com/com/, 1995.
  5. [COM+98] COM+, http://www.microsoft.com/com/complus.htm, 1998.
  6. [DCE95] DCE 1.1: Remote Procedure Call Specification, The Open Group, http://www.rdg.opengroup.org/public/pubs/catalog/c706.htm.
  7. [Hunt97] G. C. Hunt and M. L. Scott, "Coign: Efficient instrumentation for inter-component communication analysis," Tech Report 648, Dept. of Computer Science, University of Rochester, Feb. 1997.
  8. [Kirtland97] M. Kirtland, "Object-oriented software development made simple with COM+ runtime services," Microsoft Systems Journal, Vol. 12, No. 11, pp. 49-59, Nov. 1997.
  9. [Nelson97] M. Nelson, DCOM Broker, http://www.wam.umd.edu/~mikenel/dcom/broker.zip.
  10. [MTS98] Microsoft Transaction Server, http://www.microsoft.com/com/mts.htm.
  11. [SAG] EntireX DCOM, Software AG, http://www.softwareag.com/corporat/solutions/entirex/dcom/default.htm.
  12. [Wang98] Y. M. Wang and W. J. Lee, "COMERA: COM Extensible Remoting Architecture," to appear in Proc. COOTS, April 1998. (http://www.research.microsoft.com/os/ymwang/papers/HTML/COMERA/F.htm)