Project "Orleans" invented the Virtual Actor abstraction, which provides a straightforward approach to building distributed interactive applications without the need to learn complex programming patterns for handling concurrency, fault tolerance, and resource management. Orleans applications scale up automatically and are meant to be deployed in the cloud. Orleans has been used heavily by a number of high-scale cloud services at Microsoft, starting with cloud services for the Halo franchise, which have been running in production in Microsoft Azure since 2011. It was made available as open source in January 2015.
The main research paper that describes Orleans Virtual Actors is "Orleans: Distributed Virtual Actors for Programmability and Scalability" (MSR-TR-2014-41).
Virtual Actor Implementations
The open-sourced C# code for Project "Orleans" is available under an MIT license on GitHub -- https://github.com/dotnet/orleans
Microsoft Azure has released an SDK for Reliable Actors, a virtual actor programming model based heavily on the Orleans API. It runs on Service Fabric, an Azure runtime for the rapid development and updating of microservice-based applications.
The BioWare division of Electronic Arts created Project "Orbit" which is a Java implementation of virtual actors that was heavily inspired by the Orleans project. Their code is available under a BSD license on GitHub -- https://github.com/electronicarts/orbit
Some Places to Learn More
Orleans GitHub site: https://github.com/dotnet/orleans
Orleans documentation web site: http://dotnet.github.io/orleans/
Various research reports, presentations and videos about Orleans are also listed on the page below.
Overview - A Framework for Cloud Computing
Building interactive services that are scalable and reliable is hard. Interactivity imposes strict constraints on availability and latency, as that directly impacts end-user experience. To support a large number of concurrent user sessions, high throughput is essential.
The traditional three-tier architecture with stateless front-ends, stateless middle tier and a storage layer has limited scalability due to latency and throughput limits of the storage layer, which has to be consulted for every request. A caching layer is often added between the middle tier and storage to improve performance. However, a cache loses most of the concurrency and semantic guarantees of the underlying storage layer. To prevent inconsistencies caused by concurrent updates to a cached item, the application or cache manager has to implement a concurrency control protocol. With or without a cache, a stateless middle tier does not provide data locality because it uses the data shipping paradigm: for every request, data is sent from storage or cache to the middle tier server that is processing the request. The advent of social graphs where a single request may touch many entities connected dynamically with multi-hop relationships makes it even more challenging to satisfy required application-level semantics and consistency on a cache with fast response for interactive access.
The actor model offers an appealing solution to these challenges by relying on the function shipping paradigm. Actors allow building a stateful middle tier that has the performance benefits of a cache with the data locality and semantic and consistency benefits of encapsulated entities via application-specific operations. In addition, actors make it easy to implement horizontal, “social”, relations between entities in the middle tier.
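The difference between data shipping and function shipping can be illustrated with a minimal sketch. Everything in it is hypothetical and not part of any Orleans API: `Storage` is a stand-in for the storage layer that only counts round trips, and `PlayerActor` is an invented example entity whose write policy (an explicit `save`) is an assumption made for the illustration.

```python
# Illustrative sketch only: all names here are hypothetical, not Orleans APIs.

class Storage:
    """Stand-in for a storage layer that counts round trips."""

    def __init__(self):
        self.rows = {}
        self.round_trips = 0

    def read(self, key):
        self.round_trips += 1
        # Return a copy so callers cannot mutate stored data in place.
        return dict(self.rows.get(key, {"score": 0}))

    def write(self, key, row):
        self.round_trips += 1
        self.rows[key] = dict(row)


def add_score_data_shipping(storage, player, points):
    """Stateless middle tier: the data is shipped to the code on every request."""
    row = storage.read(player)
    row["score"] += points
    storage.write(player, row)
    return row["score"]


class PlayerActor:
    """Stateful middle tier: state stays with the actor, requests ship the operation."""

    def __init__(self, storage, player):
        self.storage = storage
        self.player = player
        self.state = storage.read(player)  # loaded once, when the actor is created

    def add_score(self, points):
        # Application-specific operation applied to local, encapsulated state.
        self.state["score"] += points
        return self.state["score"]

    def save(self):
        # Assumed write policy for this sketch: persist only when asked.
        self.storage.write(self.player, self.state)
```

In this sketch, applying 100 updates through `add_score_data_shipping` costs 200 storage round trips, while the same updates sent to a single `PlayerActor` followed by one `save` cost 2. A real service would choose its own persistence policy, so the numbers only illustrate the shape of the trade-off.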
Another view of distributed systems programmability is through the lens of the object-oriented programming (OOP) paradigm. While OOP is an intuitive way to model complex systems, it has been marginalized by the popular service-oriented architecture (SOA). One can still benefit from OOP when implementing service components. However, at the system level, developers have to think in terms of loosely-coupled partitioned services, which often do not match the application’s conceptual objects. This mismatch has contributed to the difficulty that mainstream developers face when building distributed systems. The actor model brings OOP back to the system level, with actors appearing to developers very much like the familiar model of interacting objects.
Actor platforms such as Erlang and Akka are a step forward in simplifying distributed system programming. However, they still burden developers with many distributed system complexities because of the relatively low level of provided abstractions and system services. The key challenges are developing application code for managing the lifecycle of actors, dealing with distributed races, handling failures and recovery of actors, placing actors, and thus managing distributed resources. To build a correct solution to such problems in the application, the developer must be a distributed systems expert.
To avoid these complexities, we built the Orleans programming model and runtime, which raises the level of the actor abstraction. Orleans targets developers who are not distributed system experts, although our expert customers have found it attractive too. It is actor-based, but differs from existing actor-based platforms by treating actors as virtual entities, not as physical ones. First, an Orleans actor always exists, virtually. It cannot be explicitly created or destroyed. Its existence transcends the lifetime of any of its in-memory instantiations, and thus transcends the lifetime of any particular server. Second, Orleans actors are automatically instantiated: if there is no in-memory instance of an actor, a message sent to the actor causes a new instance to be created on an available server. An unused actor instance is automatically reclaimed as part of runtime resource management. An actor never fails: if a server crashes, the next message sent to an actor that was running on the failed server causes Orleans to automatically re-instantiate the actor on another server, eliminating the need for applications to supervise and explicitly re-create failed actors. Third, the location of the actor instance is transparent to the application code, which greatly simplifies programming. And fourth, Orleans can automatically create multiple instances of the same stateless actor, seamlessly scaling out hot actors.
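The first three properties above (an actor always exists virtually, is instantiated on demand, and is re-instantiated after a server failure) can be modeled with a toy, single-process sketch. It is not the Orleans runtime or its API: `Runtime`, `CounterActor`, the least-loaded placement rule, and the logical clock used for idle collection are all assumptions made for illustration.

```python
# Toy model of virtual actors; every name and policy here is hypothetical.

class CounterActor:
    """Example application actor that keeps its state in memory only."""

    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


class Runtime:
    """Single-process stand-in for a cluster of servers hosting actor instances."""

    def __init__(self, servers, actor_class):
        self.actor_class = actor_class
        self.servers = {name: {} for name in servers}  # server -> {actor_id: instance}
        self.directory = {}                            # actor_id -> server
        self.last_used = {}                            # actor_id -> logical time
        self.clock = 0

    def send(self, actor_id, method, *args):
        """Deliver a message; instantiate the actor first if it is not in memory."""
        self.clock += 1
        server = self.directory.get(actor_id)
        if server is None:
            # Assumed placement policy: the server with the fewest instances.
            server = min(self.servers, key=lambda s: len(self.servers[s]))
            self.servers[server][actor_id] = self.actor_class(actor_id)
            self.directory[actor_id] = server
        self.last_used[actor_id] = self.clock
        return getattr(self.servers[server][actor_id], method)(*args)

    def collect_idle(self, max_idle):
        """Reclaim instances that have not received a message recently."""
        for actor_id, used in list(self.last_used.items()):
            if self.clock - used > max_idle:
                server = self.directory.pop(actor_id)
                del self.servers[server][actor_id]
                del self.last_used[actor_id]

    def crash(self, server):
        """Simulate a server failure: its instances vanish, the actors do not."""
        for actor_id in self.servers.pop(server):
            del self.directory[actor_id]
            del self.last_used[actor_id]
```

The caller never creates, destroys, or locates an actor; it only calls `send` with an identity. After `crash` or `collect_idle`, the next `send` to the same identity succeeds on a fresh instance. In this sketch the in-memory `count` starts again from zero on a new instance, because persisting actor state is deliberately left out.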
Overall, Orleans gives developers a virtual “actor space” that, analogous to virtual memory, allows them to invoke any actor in the system, whether or not it is present in memory. Virtualization relies on indirection that maps from virtual actors to their physical instantiations that are currently running. This level of indirection lets the runtime solve many hard distributed systems problems that the developer would otherwise have to address and that are notoriously difficult to get right, such as actor placement and load balancing, deactivation of unused actors, and actor recovery after server failures. Thus, the virtual actor approach significantly simplifies the programming model while allowing the runtime to balance load and recover from failures transparently.
The runtime supports indirection via a distributed directory that maps from actor identity to its current physical location. Orleans minimizes the runtime cost of indirection by using local caches of that map. This strategy has proven to be very efficient. We typically see cache hit rates of well over 90% in our production services.
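A minimal sketch of this lookup path, under stated assumptions: `Directory` stands in for the distributed directory as a plain in-process map, `CachingResolver` models one server's local cache, and the explicit `invalidate` call is an invented way to drop stale entries. None of these names come from Orleans.

```python
# Illustrative sketch only: names and the invalidation scheme are hypothetical.

class Directory:
    """Stand-in for the distributed directory: actor identity -> current server."""

    def __init__(self):
        self.locations = {}
        self.lookups = 0

    def lookup(self, actor_id):
        self.lookups += 1  # each call models a remote directory lookup
        return self.locations[actor_id]


class CachingResolver:
    """Per-server resolver that consults a local cache before the directory."""

    def __init__(self, directory):
        self.directory = directory
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def resolve(self, actor_id):
        if actor_id in self.cache:
            self.hits += 1
            return self.cache[actor_id]
        self.misses += 1
        location = self.directory.lookup(actor_id)
        self.cache[actor_id] = location
        return location

    def invalidate(self, actor_id):
        """Drop a cached entry that is known to be stale (assumed mechanism)."""
        self.cache.pop(actor_id, None)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With 10 actors each resolved 20 times, this resolver records 10 misses and 190 hits, so only the first message to each actor pays for a directory lookup. When an actor moves, the cached entry must be dropped before the new location is visible; how a real runtime detects staleness is outside the scope of this sketch.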
Orleans has been used to build multiple production services currently running on the Microsoft Azure cloud, including the back-end services for some popular games. This enabled us to validate the scalability and reliability of production applications written using Orleans, and to adjust its model and implementation based on this feedback. It also enabled us to verify, at least anecdotally, that the Orleans programming model leads to significantly increased programmer productivity.
Episode 142: Microsoft Research project Orleans simplify development of scalable cloud services
In this episode Chris Risner and Haishi Bai are joined by Sergey Bykov, Principal Development Lead at Microsoft Research on project Orleans, a framework to simplify development of scalable cloud services. Sergey discusses the motivation for building project Orleans, describes the concepts you need to know, and demonstrates how you can quickly get started using it.
Using Orleans to Build Halo 4’s Distributed Cloud Services in Azure
This talk will detail how the Halo 4 team at 343 Industries used the Orleans technology from Microsoft Research to build the cloud services that power the Halo 4 blockbuster title. Attendees will learn about the paradigm shift that the team went through to think about building cloud-native services using Orleans and how that transition resulted in their ability to rapidly design services that are simpler to maintain and evolve as well as being easier to conceptualize. Participants will leave this talk understanding how to use Orleans to build highly-concurrent, stateful services that scale-by-default. Participants will also learn how the virtual actor concept makes it easier to reason about and achieve fault-tolerance in the cloud.
Building Real-Time Services for Halo
Video about 343 Industries building real-time services using the Orleans framework from eXtreme Computing Group.
Orleans Slide Presentations
- Manjula Peiris, James H. Hill, Jorgen Thelin, Sergey Bykov, Gabriel Kliot, and Christian Konig, PAD: Performance Anomaly Detection in Multi-Server Distributed Systems, in 7th IEEE International Conference on Cloud Computing (IEEE Cloud 2014), IEEE – Institute of Electrical and Electronics Engineers, June 2014.
- Sergey Bykov, Gabriel Kliot, Michael Roberts, and Jorgen Thelin, Orleans Best Practices, Microsoft Research, May 2014.
- Philip A. Bernstein, Sergey Bykov, Alan Geller, Gabriel Kliot, and Jorgen Thelin, Orleans: Distributed Virtual Actors for Programmability and Scalability, no. MSR-TR-2014-41, 24 March 2014.
- Sergey Bykov, Alan Geller, Gabriel Kliot, James Larus, Ravi Pandya, and Jorgen Thelin, Orleans: Cloud Computing for Everyone, in ACM Symposium on Cloud Computing (SOCC 2011), ACM, October 2011.
- Sergey Bykov, Alan Geller, Gabriel Kliot, James Larus, Ravi Pandya, and Jorgen Thelin, Orleans: A Framework for Cloud Computing, no. MSR-TR-2010-159, 30 November 2010.