Distributed Lock Manager

Operating systems use lock managers to organize and coordinate access to resources. A distributed lock manager (DLM) [1] runs on every machine in a cluster, each with an identical copy of the cluster-wide lock database. A DLM is thus a software component that lets the computers in a cluster coordinate their access to shared resources.

DLM implementations have served as the basis for several successful cluster file systems, in which the machines of a cluster can use each other's storage through a single unified file system, with significant gains in performance and availability. The main performance benefit comes from solving the problem of disk cache coherency between the participating computers. The DLM is used not only for file locking but also to coordinate all disk access. VMS Cluster (formerly called VAXcluster), the first widely used clustering system, relies on the VAX/VMS (later OpenVMS) DLM in this way.

Content

  • 1 VMS implementation
    • 1.1 Resources
    • 1.2 Lock Modes
    • 1.3 Obtaining a lock
    • 1.4 Lock value block
    • 1.5 Deadlock detection
  • 2 Linux Clustering
  • 3 Chubby, a lock service from Google
  • 4 ZooKeeper
  • 5 etcd
  • 6 Redlock Algorithm in Redis
  • 7 Notes

VMS Implementation

DEC's VAX/VMS was the first widely available operating system to implement a DLM. The functionality appeared in version 4, although its programming interface was the same as that of the single-processor lock manager first implemented in version 3.

Resources

The DLM uses a generalized concept of a resource: any entity to which shared access must be controlled. It can be a file, a record, an area of shared memory, or anything else the application designer chooses. The designer also defines the hierarchy of resources, and with it the number of levels at which locking can take place. For example, a hypothetical database might define the following resource hierarchy:

  • Database
  • Table
  • Record
  • Field

A running process can thus take the locks it needs on the database as a whole (the parent resource) and then on individual parts of the database (subordinate resources).

Lock Modes

A process running within a VMS Cluster may obtain a lock on a resource. The DLM implements six lock modes, each granting a different level of exclusivity (compatibility with other locks). While it holds a lock, a process can convert it to a higher or lower mode. When all processes have unlocked a resource, the system's information about the resource is destroyed.

  • Null (NL). Indicates interest in the resource but does not prevent other processes from locking it. This mode is a convenient way to keep a resource, and the contents of its lock value block, in existence. It is typically used to announce to the other VMS Cluster members that an object exists on a particular node.
  • Concurrent Read (CR). Indicates a desire to read (but not update) the resource. Other processes may still read or update the resource, but none may obtain exclusive access to it. This mode is typically taken on a high-level resource in the hierarchy so that more restrictive locks can then be obtained on its subordinate resources.
  • Concurrent Write (CW). Indicates a desire to read and update the resource. Other processes may still read or update the resource, but none may obtain exclusive access to it. This mode is likewise typically taken on a high-level resource so that more restrictive locks can be obtained on its subordinate resources.
  • Protected Read (PR). The traditional share lock: it indicates a desire to read the resource and prevents others from updating it. Other processes may still read the resource.
  • Protected Write (PW). The traditional update lock: it indicates a desire to read and update the resource and prevents others from updating it. Other processes holding a Concurrent Read lock may still read the resource.
  • Exclusive (EX). The traditional exclusive lock: it allows read and update access to the resource and prevents all other processes from accessing it.
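As an illustration of how the resource hierarchy and the lock modes fit together, the following Python sketch builds the sequence of locks a process might request when working on a single record. It is only a sketch under stated assumptions: the function name `lock_plan`, the "/"-joined resource names, and the rule "CR or CW on ancestors, PR or EX on the target" are hypothetical choices made for this example, not part of the VMS interface.

```python
def lock_plan(path, write=False):
    """Return (resource_name, mode) pairs from the root down to the target.

    Assumption for this sketch: ancestors get a concurrent mode (CR for
    readers, CW for writers) and the target itself gets a stricter mode
    (PR for readers, EX for writers).
    """
    ancestor_mode = "CW" if write else "CR"
    target_mode = "EX" if write else "PR"
    plan = []
    for depth in range(1, len(path) + 1):
        # Hypothetical naming scheme: join the path components with "/".
        name = "/".join(path[:depth])
        mode = target_mode if depth == len(path) else ancestor_mode
        plan.append((name, mode))
    return plan


# Locks requested, parent first, to update one record of one table.
update_plan = lock_plan(["payroll", "employees", "record42"], write=True)
read_plan = lock_plan(["payroll", "employees", "record42"])
```

Taking the parent locks first mirrors the order described in the text: the lock on the database as a whole, then the locks on its subordinate resources.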

The following table shows which lock modes are compatible with one another:

Mode  NL   CR   CW   PR   PW   EX
NL    Yes  Yes  Yes  Yes  Yes  Yes
CR    Yes  Yes  Yes  Yes  Yes  No
CW    Yes  Yes  Yes  No   No   No
PR    Yes  Yes  No   Yes  No   No
PW    Yes  Yes  No   No   No   No
EX    Yes  No   No   No   No   No
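The compatibility table can be encoded directly as a lookup structure. The Python sketch below transcribes the table and adds a check for whether a new request can be granted alongside the locks already held; the names `COMPATIBLE`, `is_compatible` and `can_grant` are hypothetical and chosen only for this example.

```python
# Lock modes in the order used by the table.
MODES = ["NL", "CR", "CW", "PR", "PW", "EX"]

# COMPATIBLE[held] is the set of modes that may be granted to another
# process while a lock in mode `held` is in force (one table row each).
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


def is_compatible(held, requested):
    """True if `requested` can coexist with an existing `held` lock."""
    return requested in COMPATIBLE[held]


def can_grant(requested, granted_modes):
    """True if `requested` is compatible with every lock already granted."""
    return all(is_compatible(held, requested) for held in granted_modes)
```

For instance, `can_grant("PR", ["CR", "PR"])` is true, because several readers may share a resource, while `can_grant("PW", ["PR"])` is false, because a protected reader blocks updates.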

Obtaining a lock

A process obtains a lock on a resource by enqueueing a lock request with the SYS$ENQ system service. This is similar to QIO (Queue I/O), the VMS mechanism used to perform I/O. A lock request can complete either synchronously, in which case the process waits until the lock is granted, or asynchronously, in which case an asynchronous system trap (AST) is delivered when the lock has been obtained.

A process can also establish a blocking AST, which is triggered when the process holds a lock that is preventing another process from accessing the resource (VMS developers refer to it informally as a "doorbell AST"). The holder can then, if it chooses, take action to let the other process in, for example by converting its lock to a lower mode or releasing it. The blocking AST gives application developers a convenient way to coordinate application instances when only one instance may be active and the rest must wait. One example in a VMS Cluster is the cluster node alias: a DECnet, LAT, or IP address that migrates between nodes according to load (load balancing) or availability (failover).
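The interplay between an asynchronous grant and a blocking AST can be sketched with a small single-process simulation in Python. This is not the SYS$ENQ interface: the `Resource` class, its `enqueue` and `dequeue` methods, and the callback signatures are hypothetical stand-ins, with plain callbacks playing the role of ASTs.

```python
from collections import deque

# Same compatibility relation as in the lock mode table.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


class Resource:
    """Toy model of one lockable resource with a FIFO wait queue."""

    def __init__(self, name):
        self.name = name
        self.granted = []       # entries: (owner, mode, on_blocking)
        self.waiting = deque()  # entries: (owner, mode, on_granted, on_blocking)

    def _grantable(self, mode):
        return all(mode in COMPATIBLE[held] for _, held, _ in self.granted)

    def enqueue(self, owner, mode, on_granted, on_blocking=None):
        """Request a lock; on_granted stands in for the completion AST."""
        if not self.waiting and self._grantable(mode):
            self.granted.append((owner, mode, on_blocking))
            on_granted(owner, mode)
            return True
        self.waiting.append((owner, mode, on_granted, on_blocking))
        # "Ring the doorbell" of every holder that blocks this request.
        # Simplification: callbacks must not release locks re-entrantly.
        for holder, held, callback in list(self.granted):
            if callback is not None and mode not in COMPATIBLE[held]:
                callback(holder, owner, mode)
        return False

    def dequeue(self, owner):
        """Release the owner's lock and grant waiting requests in order."""
        self.granted = [g for g in self.granted if g[0] != owner]
        while self.waiting and self._grantable(self.waiting[0][1]):
            w_owner, mode, on_granted, on_blocking = self.waiting.popleft()
            self.granted.append((w_owner, mode, on_blocking))
            on_granted(w_owner, mode)


events = []
alias = Resource("cluster_alias")
alias.enqueue("node_a", "EX",
              lambda owner, mode: events.append(("granted", owner, mode)),
              lambda holder, waiter, mode: events.append(("blocking", holder, waiter)))
alias.enqueue("node_b", "EX",
              lambda owner, mode: events.append(("granted", owner, mode)))
alias.dequeue("node_a")  # node_a reacts to the doorbell by releasing its lock
```

In this run `node_a` is granted the lock immediately, is notified when `node_b` queues an incompatible request, and `node_b` is granted the lock once `node_a` releases it.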

Lock value block

A 32-byte lock value block is associated with each resource. It can be read by any process that obtains a lock on the resource (other than a Null lock) and can be updated by a process that holds a PW or EX lock on it.

It can hold any information about the resource that the application designer chooses. A typical use is to store a version number for the resource. Each time the associated object (for example, a database record) is updated, the holder of the lock increments the value in the lock value block. When another process wants to read the resource, it obtains the appropriate lock and compares the current lock value with the value it saw the last time it locked the resource. If the values match, the process knows that the resource has not been updated since it last read it, so there is no need to read it again. This technique can be used to implement various kinds of caches in a database or similar application.
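The version-number technique can be sketched in Python as follows. The classes `SharedRecord` and `CachingReader` are hypothetical, and no real locking takes place: the comments mark where a PW/EX or read lock would be held in an actual DLM. The 32-byte size is taken from the text.

```python
LVB_SIZE = 32  # lock value block size as stated in the text


class SharedRecord:
    """Stand-in for a resource together with its lock value block."""

    def __init__(self, data):
        self.data = data
        self.lvb = bytes(LVB_SIZE)  # version counter, initially zero
        self.reads = 0              # counts expensive reads of the data

    def update(self, data):
        # A real writer would hold a PW or EX lock here.
        version = int.from_bytes(self.lvb, "big") + 1
        self.data = data
        self.lvb = version.to_bytes(LVB_SIZE, "big")

    def read_data(self):
        self.reads += 1
        return self.data


class CachingReader:
    """Re-reads the record only when the lock value block has changed."""

    def __init__(self, record):
        self.record = record
        self.cached_lvb = None
        self.cached_data = None

    def get(self):
        # A real reader would obtain a read lock and receive the
        # lock value block together with the grant.
        current = self.record.lvb
        if current != self.cached_lvb:
            self.cached_data = self.record.read_data()
            self.cached_lvb = current
        return self.cached_data


record = SharedRecord("balance=100")
reader = CachingReader(record)
first = reader.get()
second = reader.get()           # served from the cache
record.update("balance=250")    # writer bumps the version
third = reader.get()            # version changed, so the data is re-read
```

The reader touches the underlying data only twice across three calls, which is the saving the lock value block is meant to provide.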

Another use is interprocess communication (IPC), where processes on different VMS Cluster nodes need to exchange small pieces of data (up to 32 bytes) with low latency. For larger exchanges (up to 1 MB), the Intra-Cluster Communication service (ICC, SYS$ICC [2]) is used instead.

Deadlock detection

A deadlock is a situation in which several processes wait indefinitely for resources that are held by those same processes. E. Dijkstra originally called this situation a "deadly embrace" [3].

The OpenVMS DLM periodically checks for deadlock situations. Suppose one process holds a lock on resource 1 and is waiting for resource 2, which is locked by a second process that is in turn waiting for resource 1. In that case the second process's request is returned with a deadlock status. That process must then take action to break the deadlock, here by releasing the lock it obtained first.
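The text does not describe the algorithm OpenVMS uses for this check. As a generic illustration only, a deadlock such as the one in the example can be found by searching a "wait-for" graph for a cycle. The Python sketch below assumes the graph is given as a dictionary mapping each process to the processes whose locks it is waiting on; the function name `find_deadlock` is hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of processes forming a wait cycle, or None.

    `waits_for` maps a process to the processes it is waiting on.
    The search is a depth-first traversal that reports the first cycle found.
    """
    visiting = set()  # processes on the current search path
    done = set()      # processes already proven to be cycle-free

    def visit(node, path):
        if node in done:
            return None
        if node in visiting:
            # The path from the first occurrence of `node` is the cycle.
            return path[path.index(node):]
        visiting.add(node)
        path.append(node)
        for blocker in waits_for.get(node, ()):
            cycle = visit(blocker, path)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# The two-process example from the text: P1 holds resource 1 and waits
# for resource 2 (held by P2), while P2 waits for resource 1.
deadlocked = find_deadlock({"P1": ["P2"], "P2": ["P1"]})
healthy = find_deadlock({"P1": ["P2"], "P2": []})
```

Once a cycle is found, a lock manager still has to pick a victim; in the example from the text, it is the second process whose request fails with a deadlock status.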

Linux clustering

In January 2006, the code for OCFS2 (Oracle Cluster File System) [4], contributed by Oracle's developers, was merged into version 2.6.16 of the Linux kernel. In November 2006, Red Hat's cluster support code [5], including support for the GFS2 file system, was added in kernel 2.6.19. Both systems are modeled on the successful VMS DLM [6]. Oracle's distributed lock manager had a simplified API: its core function, dlmlock(), took only 8 parameters, whereas the corresponding VMS system service SYS$ENQ and the dlm_lock function in Red Hat's DLM each took 11.

Chubby, a lock service from Google

Google has developed its own lock service for loosely coupled distributed systems, called Chubby [7]. The service is designed to provide coarse-grained locking as well as a limited but reliable distributed file system. Key parts of Google's infrastructure, including the Google File System, BigTable, and MapReduce, use Chubby to synchronize access to shared resources. Although Chubby was originally designed as a lock service, Google now uses it widely as a name server in place of DNS [7].

ZooKeeper

Apache ZooKeeper, a project of the Apache Software Foundation, is a distributed hierarchical key-value store that provides a distributed configuration service, a synchronization service, and a naming registry for large distributed systems [8]. ZooKeeper can also be used as a distributed lock manager [9]. ZooKeeper began as a sub-project of Hadoop but is now a top-level ASF project.

The ZooKeeper architecture supports high availability through redundant services. Clients can therefore turn to another ZooKeeper server if the current one fails to respond. ZooKeeper nodes store their data in a hierarchical namespace, much like a file system or a tree data structure [10].

ZooKeeper is used by many companies, including Rackspace, Yahoo! [11], Odnoklassniki, Reddit [12], Yandex [13], and eBay, as well as by the open-source full-text search platform Solr [14].

etcd

etcd, which is used to update node configuration across the CoreOS cluster infrastructure, also provides the capabilities of a distributed lock manager [15].

Redlock Algorithm in Redis

Redis, an open-source, high-performance, non-relational networked key-value data store with journaling, can be used to implement the Redlock distributed locking algorithm [16].

Notes

  1. Lawrence Kenah, Ruth Goldenberg. VAX/VMS Internals and Data Structures: Version 5.2. - Bedford, MA: Digital Press, 1987-12-21. - 1427 p. - ISBN 9781555580599.
  2. Hewlett-Packard Company, Palo Alto, California. HP OpenVMS System Services Reference Manual.
  3. Gehani, Narain. Ada: Concurrent Programming. - Silicon Press, 1991-01-01. - ISBN 9780929306087.
  4. kernel/git/torvalds/linux.git - Linux kernel source tree. git.kernel.org. Retrieved February 14, 2017.
  5. kernel/git/torvalds/linux.git - Linux kernel source tree. git.kernel.org. Retrieved February 14, 2017.
  6. The OCFS2 filesystem [LWN.net]. lwn.net. Retrieved February 14, 2017.
  7. Google Research Publication: Chubby Distributed Lock Service. research.google.com. Retrieved February 14, 2017.
  8. Index - Apache ZooKeeper - Apache Software Foundation. cwiki.apache.org. Retrieved February 14, 2017.
  9. ZooKeeper Recipes and Solutions (dead link). zookeeper.apache.org. Retrieved February 14, 2017. Archived February 16, 2017.
  10. ProjectDescription - Apache ZooKeeper - Apache Software Foundation. cwiki.apache.org. Retrieved February 14, 2017.
  11. ZooKeeper/PoweredBy - Hadoop Wiki (dead link). wiki.apache.org. Retrieved February 14, 2017. Archived December 9, 2013.
  12. Why Reddit was down on Aug 11 • /r/announcements. reddit. Retrieved February 14, 2017.
  13. ZooKeeper as a guaranteed delivery system for Yandex.Mail. habr. Retrieved June 28, 2019.
  14. SolrCloud - Apache Solr Reference Guide - Apache Software Foundation. cwiki.apache.org. Retrieved February 14, 2017.
  15. etcd/demo.md at master · coreos/etcd · GitHub. github.com. Retrieved February 14, 2017.
  16. Distributed locks with Redis - Redis. redis.io. Retrieved February 14, 2017.
Source - https://ru.wikipedia.org/w/index.php?title=Distributed_Lock_Manager&oldid=102611285