Ram Pai, who wrote ccm, has some nice presentation slides for ccm. The presentation also includes some information about EVMS. You can download it here.

Here is a diagram I (Guochun Shi) extracted from the code.

There are 9 states in the ccm state machine, as shown in the diagram. The numbers denote transitions from one state to another.

States

        CCM_STATE_NONE, 
        CCM_STATE_VERSION_REQUEST,
        CCM_STATE_JOINING,      
        CCM_STATE_RCVD_UPDATE,  
        CCM_STATE_SENT_MEMLISTREQ,      
        CCM_STATE_REQ_MEMLIST,  
        CCM_STATE_MEMLIST_RES,  
        CCM_STATE_JOINED,    
        CCM_STATE_WAIT_FOR_MEM_LIST,
        CCM_STATE_WAIT_FOR_CHANGE,
        CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST,
        CCM_STATE_END

CCM_STATE_NONE

A node in this state has not done anything yet. Every node is initialized to this state. If something goes wrong during a transition in another state, a node may reset itself to this state so that it can start a new round.

1 --- After sending out a CCM_TYPE_PROTOVERSION message, the state changes to CCM_STATE_VERSION_REQUEST.

CCM_STATE_VERSION_REQUEST

A node is in this state after it sends out a message (t = CCM_TYPE_PROTOVERSION) asking for a cluster context. After that it either goes to the CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST state upon receiving a response (t = CCM_TYPE_PROTOVERSION_RESP) or times out. On timeout, if it figures out that it is the only active node in the cluster, it enters CCM_STATE_JOINED; otherwise this round is a failure and it goes back to CCM_STATE_NONE.

2 --- If it times out and we still have tries left, we reset to CCM_STATE_NONE.

3 --- Received a CCM_TYPE_PROTOVERSION_RESP message. We send out a CCM_TYPE_ALIVE message and change to the CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST state.

5 --- We tried the maximum number of times and still got no response. If we are the highest joined, we change state to CCM_STATE_JOINED.
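
The three transitions out of CCM_STATE_VERSION_REQUEST can be sketched roughly as below. This is only an illustration of the description above, not the actual ccm code: the function name and the parameters tries_left and highest_joined are hypothetical, the enums are cut down to the values needed here, and their numeric values are not those of the real headers.

```c
#include <stdbool.h>

/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_NONE,
    CCM_STATE_VERSION_REQUEST,
    CCM_STATE_JOINED,
    CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST
};

enum ccm_type {
    CCM_TYPE_PROTOVERSION_RESP,
    CCM_TYPE_TIMEOUT
};

/* Hypothetical helper: next state for a node in CCM_STATE_VERSION_REQUEST.
 * tries_left and highest_joined are illustrative inputs, not ccm fields. */
enum ccm_state
version_request_next(enum ccm_type msg, int tries_left, bool highest_joined)
{
    if (msg == CCM_TYPE_PROTOVERSION_RESP) {
        /* transition 3: the real code also sends CCM_TYPE_ALIVE here */
        return CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST;
    }
    if (tries_left > 0) {
        /* transition 2: timed out, but we may try again */
        return CCM_STATE_NONE;
    }
    /* transition 5 if we are the highest joined; otherwise the round failed */
    return highest_joined ? CCM_STATE_JOINED : CCM_STATE_NONE;
}
```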

CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST

A node in this state can move to CCM_STATE_JOINED if it receives a CCM_TYPE_MEM_LIST message. This can happen if there is only one node in the previously existing cluster. If it receives a CCM_TYPE_JOIN message or times out, it will enter the CCM_STATE_JOINING state.

6 --- Received a CCM_TYPE_MEM_LIST message (there is only one node in the previous cluster). It changes state to CCM_STATE_JOINED.

4 -- Timeout, or received a CCM_TYPE_JOIN message from any node, or received a CCM_TYPE_LEAVE message from the DC. It starts to send out CCM_TYPE_JOIN messages and changes its state to CCM_STATE_JOINING.

CCM_STATE_JOINING

A node in this state will broadcast join messages until it receives responses from all nodes or times out. Depending on whether it is the cluster leader, it will either send a request membership message (t = CCM_TYPE_REQ_MEMLIST) and enter the CCM_STATE_SENT_MEMLISTREQ state, or reply to the membership request message and enter the CCM_STATE_MEMLIST_RES state.

7 --- If ccm has received a CCM_TYPE_JOIN message from all nodes, or it times out, and I am not the leader, change into CCM_STATE_MEMLIST_RES.

8 -- If ccm has received a CCM_TYPE_JOIN message from all nodes, or it times out, and I am the leader, change into CCM_STATE_SENT_MEMLISTREQ.

18 -- If ccm has exceeded the longer timeout, we reset ourselves and change into state CCM_STATE_NONE.
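
A minimal sketch of the decision a node makes in CCM_STATE_JOINING, following the paragraph that describes this state: the leader moves to CCM_STATE_SENT_MEMLISTREQ and everyone else to CCM_STATE_MEMLIST_RES. The function name and its boolean parameters are hypothetical; the real code derives these conditions from its own bookkeeping.

```c
#include <stdbool.h>

/* Subset of the states listed on this page. */
enum ccm_state {
    CCM_STATE_NONE,
    CCM_STATE_JOINING,
    CCM_STATE_SENT_MEMLISTREQ,
    CCM_STATE_MEMLIST_RES
};

/* Hypothetical helper: next state for a node in CCM_STATE_JOINING. */
enum ccm_state
joining_next(bool all_joined, bool timed_out, bool long_timeout, bool is_leader)
{
    if (long_timeout) {
        /* transition 18: give up and start over */
        return CCM_STATE_NONE;
    }
    if (!all_joined && !timed_out) {
        /* still collecting join messages, keep broadcasting */
        return CCM_STATE_JOINING;
    }
    /* transition 8 for the leader (sends CCM_TYPE_REQ_MEMLIST),
     * transition 7 for everyone else (replies with CCM_TYPE_RES_MEMLIST) */
    return is_leader ? CCM_STATE_SENT_MEMLISTREQ : CCM_STATE_MEMLIST_RES;
}
```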

CCM_STATE_SENT_MEMLISTREQ

This is the potential leader state. A node in this state may get all responses, compute the membership and enter the CCM_STATE_JOINED state, or it may go to CCM_STATE_NONE if something goes wrong.

17 -- On receiving a CCM_TYPE_TIMEOUT/CCM_TYPE_REQ_MEMLIST/CCM_TYPE_RES_MEMLIST message, the node finds that it has already exceeded the info->itf timeout and therefore changes into state CCM_STATE_NONE.

10 -- On receiving a CCM_TYPE_TIMEOUT when it has not yet exceeded the info->itf timeout, or on receiving a CCM_TYPE_RES_MEMLIST/CCM_TYPE_LEAVE message when it has received all CCM_TYPE_RES_MEMLIST messages and has not yet exceeded the info->itf timeout, the node sends out CCM_TYPE_FINAL_MEMLIST and changes into CCM_STATE_JOINED as leader.
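
Transitions 17 and 10 might be sketched like this. The function and its parameters are hypothetical; itf_exceeded stands for "the info->itf timeout has been exceeded". Combinations that the text above does not describe are assumed to leave the node where it is.

```c
#include <stdbool.h>

/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_NONE,
    CCM_STATE_SENT_MEMLISTREQ,
    CCM_STATE_JOINED
};

enum ccm_type {
    CCM_TYPE_REQ_MEMLIST,
    CCM_TYPE_RES_MEMLIST,
    CCM_TYPE_TIMEOUT,
    CCM_TYPE_LEAVE
};

/* Hypothetical helper: next state for a node in CCM_STATE_SENT_MEMLISTREQ. */
enum ccm_state
sent_memlistreq_next(enum ccm_type msg, bool itf_exceeded, bool all_responses)
{
    switch (msg) {
    case CCM_TYPE_TIMEOUT:
        /* 17 if too late, otherwise 10 (send CCM_TYPE_FINAL_MEMLIST) */
        return itf_exceeded ? CCM_STATE_NONE : CCM_STATE_JOINED;
    case CCM_TYPE_REQ_MEMLIST:
        /* 17 if too late; otherwise assumed to stay */
        return itf_exceeded ? CCM_STATE_NONE : CCM_STATE_SENT_MEMLISTREQ;
    case CCM_TYPE_RES_MEMLIST:
        if (itf_exceeded)
            return CCM_STATE_NONE;                       /* 17 */
        return all_responses ? CCM_STATE_JOINED          /* 10 */
                             : CCM_STATE_SENT_MEMLISTREQ;
    case CCM_TYPE_LEAVE:
        if (!itf_exceeded && all_responses)
            return CCM_STATE_JOINED;                     /* 10 */
        return CCM_STATE_SENT_MEMLISTREQ;                /* assumed */
    }
    return CCM_STATE_SENT_MEMLISTREQ;
}
```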

CCM_STATE_MEMLIST_RES

A node in this state is expecting a message t = CCM_TYPE_FINAL_MEMLIST from the cluster leader. If it gets it, it enters CCM_STATE_JOINED. Otherwise it will enter CCM_STATE_JOINING again to start a new round. That could happen if the cluster leader dies in this step.

9 -- On receiving a CCM_TYPE_FINAL_MEMLIST message, ccm changes into state CCM_STATE_JOINED.

19 -- On receiving a CCM_TYPE_REQ_MEMLIST again where the minor transaction number does not match, we reset ourselves to state CCM_STATE_NONE.

20 -- On receiving a CCM_TYPE_JOIN with a greater trans_minor value, or on timeout, or when the node that we think of as the cluster leader has left, we change into state CCM_STATE_JOINING.
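
As a rough sketch, the transitions out of CCM_STATE_MEMLIST_RES could be written as follows. The target of transition 20 is taken from the paragraph describing this state (the node enters CCM_STATE_JOINING again). The function name and the parameters local_minor, msg_minor and leaver_is_leader are hypothetical.

```c
#include <stdbool.h>

/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_NONE,
    CCM_STATE_JOINING,
    CCM_STATE_MEMLIST_RES,
    CCM_STATE_JOINED
};

enum ccm_type {
    CCM_TYPE_JOIN,
    CCM_TYPE_REQ_MEMLIST,
    CCM_TYPE_FINAL_MEMLIST,
    CCM_TYPE_TIMEOUT,
    CCM_TYPE_LEAVE
};

/* Hypothetical helper: next state for a node in CCM_STATE_MEMLIST_RES.
 * local_minor / msg_minor are the minor transaction numbers of this node
 * and of the received message. */
enum ccm_state
memlist_res_next(enum ccm_type msg, int local_minor, int msg_minor,
                 bool leaver_is_leader)
{
    switch (msg) {
    case CCM_TYPE_FINAL_MEMLIST:
        return CCM_STATE_JOINED;                          /* 9 */
    case CCM_TYPE_REQ_MEMLIST:
        if (msg_minor != local_minor)
            return CCM_STATE_NONE;                        /* 19 */
        break;
    case CCM_TYPE_JOIN:
        if (msg_minor > local_minor)
            return CCM_STATE_JOINING;                     /* 20 */
        break;
    case CCM_TYPE_TIMEOUT:
        return CCM_STATE_JOINING;                         /* 20 */
    case CCM_TYPE_LEAVE:
        if (leaver_is_leader)
            return CCM_STATE_JOINING;                     /* 20 */
        break;
    }
    /* anything else is assumed to leave the state unchanged */
    return CCM_STATE_MEMLIST_RES;
}
```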

CCM_STATE_JOINED

This is the converged state for each node if all goes well. If some node starts a new round by sending join messages (t = CCM_TYPE_JOIN) or the cluster leader dies, then this node will also start to broadcast join messages and enter the CCM_STATE_JOINING state. Otherwise, it will record changes and go to CCM_STATE_WAIT_FOR_CHANGE.

When a new node joins an already converged cluster, the ideal case is: the new node sends out an "I am alive" message (t = CCM_TYPE_ALIVE) and enters the CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST state. The non-leader nodes, upon receiving this message, inform the cluster leader about it. The cluster leader collects all messages and replies to the new node with a new membership. If anything fails in this process, one of them (the cluster leader or the new node) will, on timeout, initiate a join protocol and everyone will be in the CCM_STATE_JOINING state.

16 -- On receiving a CCM_TYPE_JOIN message with a minor transaction number greater than the local one, or on receiving a CCM_TYPE_LEAVE with the leaving node being the leader, send out a CCM_TYPE_JOIN message and change into CCM_STATE_JOINING.

12 -- (a) We are the leader, and we receive a CCM_TYPE_LEAVE message where the leaving node is not the leader and we have not received all change messages yet.

action: ccm changes its state to CCM_STATE_WAIT_FOR_CHANGE.

11 -- (a) We are not the leader, and we receive a CCM_TYPE_LEAVE message where the leaving node is not the leader and we have not received all change messages yet.

action: ccm changes its state to CCM_STATE_WAIT_FOR_MEM_LIST.
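
The three numbered transitions out of CCM_STATE_JOINED (16, 12 and 11) can be sketched as below. The function and its parameters are hypothetical: newer_minor means the received CCM_TYPE_JOIN carries a minor transaction number greater than the local one, and changes_pending means not all change messages have been received yet. Cases the text does not cover are assumed to leave the node in CCM_STATE_JOINED.

```c
#include <stdbool.h>

/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_JOINING,
    CCM_STATE_JOINED,
    CCM_STATE_WAIT_FOR_MEM_LIST,
    CCM_STATE_WAIT_FOR_CHANGE
};

enum ccm_type {
    CCM_TYPE_JOIN,
    CCM_TYPE_LEAVE
};

/* Hypothetical helper: next state for a node in CCM_STATE_JOINED. */
enum ccm_state
joined_next(enum ccm_type msg, bool is_leader, bool leaver_is_leader,
            bool newer_minor, bool changes_pending)
{
    if (msg == CCM_TYPE_JOIN && newer_minor)
        return CCM_STATE_JOINING;             /* 16: also sends CCM_TYPE_JOIN */

    if (msg == CCM_TYPE_LEAVE) {
        if (leaver_is_leader)
            return CCM_STATE_JOINING;         /* 16: the leader is gone */
        if (changes_pending) {
            /* 12 for the leader, 11 for everyone else */
            return is_leader ? CCM_STATE_WAIT_FOR_CHANGE
                             : CCM_STATE_WAIT_FOR_MEM_LIST;
        }
    }
    return CCM_STATE_JOINED;                  /* assumed: nothing to do */
}
```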

CCM_STATE_WAIT_FOR_CHANGE

This is a helper state for processing requests in CCM_STATE_JOINED.

13 -- On receiving a CCM_TYPE_LEAVE/CCM_TYPE_NODE_LEAVE/CCM_TYPE_ALIVE/CCM_TYPE_NEW_NODE message when ccm has received all necessary messages, change into state CCM_STATE_JOINED.

15 -- On receiving a CCM_TYPE_LEAVE/CCM_TYPE_NODE_LEAVE/CCM_TYPE_ALIVE/CCM_TYPE_NEW_NODE message that is not expected, or on timeout, or on receiving a CCM_TYPE_JOIN message, change into state CCM_STATE_JOINING.
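
A small sketch of transitions 13 and 15 for CCM_STATE_WAIT_FOR_CHANGE. The function name and the flags expected and all_received are hypothetical stand-ins for whatever checks the real code performs. Staying in the state while more change messages are outstanding is an assumption.

```c
#include <stdbool.h>

/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_JOINING,
    CCM_STATE_JOINED,
    CCM_STATE_WAIT_FOR_CHANGE
};

enum ccm_type {
    CCM_TYPE_JOIN,
    CCM_TYPE_TIMEOUT,
    CCM_TYPE_LEAVE,
    CCM_TYPE_NODE_LEAVE,
    CCM_TYPE_ALIVE,
    CCM_TYPE_NEW_NODE
};

/* Hypothetical helper: next state for a node in CCM_STATE_WAIT_FOR_CHANGE. */
enum ccm_state
wait_for_change_next(enum ccm_type msg, bool expected, bool all_received)
{
    switch (msg) {
    case CCM_TYPE_JOIN:
    case CCM_TYPE_TIMEOUT:
        return CCM_STATE_JOINING;                 /* 15 */
    case CCM_TYPE_LEAVE:
    case CCM_TYPE_NODE_LEAVE:
    case CCM_TYPE_ALIVE:
    case CCM_TYPE_NEW_NODE:
        if (!expected)
            return CCM_STATE_JOINING;             /* 15 */
        if (all_received)
            return CCM_STATE_JOINED;              /* 13 */
        break;                                    /* assumed: keep waiting */
    }
    return CCM_STATE_WAIT_FOR_CHANGE;
}
```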

CCM_STATE_WAIT_FOR_MEM_LIST

A non-leader cluster node enters this state after it sends out a new node message in the CCM_STATE_JOINED state. It is expecting a CCM_TYPE_MEM_LIST message. The node will enter CCM_STATE_JOINING if it does not get that membership list.

14 -- On receiving a CCM_TYPE_MEM_LIST message, we change state to CCM_STATE_JOINED.

15 -- (a) Timeout. We did not get the CCM_TYPE_MEM_LIST message we were expecting.

action: We change into CCM_STATE_JOINING and send out a CCM_TYPE_JOIN message.

Message types

        CCM_TYPE_PROTOVERSION,
        CCM_TYPE_PROTOVERSION_RESP,
        CCM_TYPE_JOIN,
        CCM_TYPE_REQ_MEMLIST,
        CCM_TYPE_RES_MEMLIST,
        CCM_TYPE_FINAL_MEMLIST,
        CCM_TYPE_MEM_LIST,
        CCM_TYPE_ABORT,
        CCM_TYPE_TIMEOUT,
        CCM_TYPE_LEAVE,
        CCM_TYPE_NODE_LEAVE,
        CCM_TYPE_ALIVE,
        CCM_TYPE_NEW_NODE,
        CCM_TYPE_LAST

CCM_TYPE_PROTOVERSION

This message is only sent out by a node in state CCM_STATE_NONE. CCM will change into state CCM_STATE_VERSION_REQUEST after sending out this message.

CCM_TYPE_PROTOVERSION_RESP

This message is only sent out by a node that is the leader, or is about to become the leader, in state CCM_STATE_JOINED to a node that wants to join. CCM will stay in CCM_STATE_JOINED or change into CCM_STATE_JOINED after sending out this message. However, sending this message is not a necessary step for staying in or changing into the CCM_STATE_JOINED state.

CCM_TYPE_FINAL_MEMLIST

This message is only sent out by ccm in state CCM_STATE_SENT_MEMLISTREQ, either when ccm times out but has not exceeded info->itf yet, or when it receives a CCM_TYPE_RES_MEMLIST message that completes the set of CCM_TYPE_RES_MEMLIST messages and it has not exceeded info->itf.

CCM_TYPE_JOIN

This message is sent out in many cases, e.g. when ccm receives a CCM_TYPE_LEAVE/CCM_TYPE_NEW_NODE message. CCM will stay in or change into CCM_STATE_JOINING after it sends out this message.

CCM_TYPE_REQ_MEMLIST

This message is sent out by ccm on receiving a CCM_TYPE_TIMEOUT or CCM_TYPE_JOIN message, when either our wait time has expired or we have received responses from all nodes. We decide that we are the leader and send a CCM_TYPE_REQ_MEMLIST message to the whole cluster.

CCM_TYPE_RES_MEMLIST

This message is sent out in the following cases:

a) ccm is in state CCM_STATE_MEMLIST_RES/CCM_STATE_REQ_MEMLIST and receives a CCM_TYPE_REQ_MEMLIST, which means some other node thinks it is the leader but we don't think so. We send out a CCM_TYPE_RES_MEMLIST message with NULL message.

b) On receiving a CCM_TYPE_TIMEOUT or CCM_TYPE_JOIN message, when either our wait time has expired or we have received responses from all nodes, we decide that we are not the cluster leader. We send a CCM_TYPE_RES_MEMLIST message with valid membership to the valid cluster leader and with NULL membership to any invalid cluster.

CCM_TYPE_ABORT

This message does not seem to be necessary. It shall be removed after everything is clear.

CCM_TYPE_TIMEOUT

This message is sent out by ccm in case of timeout, which is called by hb_timeout_dispatch().

CCM_TYPE_LEAVE

This message is only sent out when ccm dies on some node. There are two ways that we know ccm has died on some node: a) we receive a T_APICLISTAT message with F_STATUS != JOINSTATUS; b) we detect that some nodes are dead.

FIXME: is the ccm client status callback guaranteed? If yes, we don't need to handle case b).

CCM_TYPE_NODE_LEAVE

This message is only sent out by a node that is not the leader and is in state CCM_STATE_JOINED, on receiving a CCM_TYPE_LEAVE message. CCM will change into state CCM_STATE_WAIT_FOR_MEM_LIST after sending out this message.

CCM_TYPE_ALIVE

This message is only sent out by a node in state CCM_STATE_VERSION_REQUEST, on receiving a CCM_TYPE_PROTOVERSION_RESP message. ccm changes to CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST after sending out this message.

CCM_TYPE_NEW_NODE

This message is only sent out by a non-leader node in CCM_STATE_JOINED on receiving a CCM_TYPE_ALIVE message. ccm changes to CCM_STATE_WAIT_FOR_MEM_LIST after sending out this message.

CCM_TYPE_LAST

Invalid message, never used.
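
For several message types, the descriptions on this page say which state the sender ends up in after sending. A minimal lookup sketch that collects those statements is shown below. The function state_after_sending is hypothetical, the enums are cut down, and message types without a single stated target state simply keep the current state.

```c
/* Subset of the states and message types listed on this page. */
enum ccm_state {
    CCM_STATE_NONE,
    CCM_STATE_VERSION_REQUEST,
    CCM_STATE_JOINING,
    CCM_STATE_SENT_MEMLISTREQ,
    CCM_STATE_JOINED,
    CCM_STATE_WAIT_FOR_MEM_LIST,
    CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST
};

enum ccm_type {
    CCM_TYPE_PROTOVERSION,
    CCM_TYPE_JOIN,
    CCM_TYPE_REQ_MEMLIST,
    CCM_TYPE_FINAL_MEMLIST,
    CCM_TYPE_ABORT,
    CCM_TYPE_NODE_LEAVE,
    CCM_TYPE_ALIVE,
    CCM_TYPE_NEW_NODE
};

/* Hypothetical helper: state of the sender right after sending `sent`. */
enum ccm_state
state_after_sending(enum ccm_type sent, enum ccm_state current)
{
    switch (sent) {
    case CCM_TYPE_PROTOVERSION:
        return CCM_STATE_VERSION_REQUEST;
    case CCM_TYPE_JOIN:
        return CCM_STATE_JOINING;
    case CCM_TYPE_REQ_MEMLIST:
        return CCM_STATE_SENT_MEMLISTREQ;
    case CCM_TYPE_FINAL_MEMLIST:
        return CCM_STATE_JOINED;
    case CCM_TYPE_NODE_LEAVE:
    case CCM_TYPE_NEW_NODE:
        return CCM_STATE_WAIT_FOR_MEM_LIST;
    case CCM_TYPE_ALIVE:
        return CCM_STATE_NEW_NODE_WAIT_FOR_MEM_LIST;
    default:
        /* no single target state is stated for this message type */
        return current;
    }
}
```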

CCMStateMachine (last edited 2005-03-08 17:50:20 by Guochun Shi)