Bounded Retransmission Protocol
Please note: The syntax of the examples presented here is not compatible with current versions of mcpta. The mcpta download includes updated model files.
The Bounded Retransmission Protocol, or BRP, is an industrial data link protocol proposed by Philips for transmitting files, partitioned into chunks, over unreliable channels [HSV93]. It is a variant of the Alternating Bit Protocol with a bounded number of retransmissions per chunk. The BRP has been modelled and verified with a multitude of different formalisms and methods. In particular, timed and functional properties have been checked using a timed automata model [DKRT97], while a probabilistic model based on Markov decision processes has been studied both using the RAPTURE verification tool [DJJL01] and PRISM.
Our Modest models combine the timed and probabilistic models into a probabilistic timed model. This allows us to not only check the properties from the previously mentioned case studies, but also probabilistic time-bounded as well as expected-time reachability properties.
Modelling
The model's parameters are N, the number of frames per file, MAX, the maximum number of retransmissions per frame, and TD, the maximum transmission delay of the communication channel.
Flat model
Our first model is composed of otherwise flat processes for the partners involved, such as sender and receiver. While this model is functionally correct, it is rather difficult to understand, debug and extend:
// Modest model of the Bounded Retransmission Protocol (BRP) action put, get, put_k, get_k, put_l, get_l; action new_file; action s_ok, s_dk, s_nok, s_restart; action r_ok, r_inc, r_fst, r_nok, r_timeout; exception error, channel_overflow, premature_timeout; // Constants const int N; // number of frames per file const int MAX; // maximum number of retransmissions per frame const int TD; // transmission delay const int TS = 2 * TD + 1; // sender timeout const int TR = 2 * MAX * TS + 3 * TD; // receiver timeout const int SYNC = TR; bool ff, lf, ab; // Channel data: first/last frame, alternating bit int i limit [0..N]; // Sender chunk counter bool inTransitK = false; bool inTransitL = false; bool T; bool first_file_done = false; // Observation constraint not for deadline properties observe get_k, s_ok, s_dk, s_nok, s_restart, r_ok, r_inc, r_fst, r_nok, r_timeout when T; // Properties // Invariant (timed) properties (from [DKRT97]) // "there is at most one message in transit for each channel" property T_1 = invariant !threw(channel_overflow); // "there is at most one message in transit in total" property T_2 = invariant !(inTransitK && inTransitL); // Assumption (A1): "no premature timeouts" property T_A1 = invariant !threw(premature_timeout); // Assumption (A2): "sender starts new file only after receiver reacted to failure" // Note: receiver can only notice failure if it received at least one chunk, i.e. did(get_k) property T_A2 = invariant !did(s_restart) || !did(get_k) || did(r_timeout); // Probabilistic reachability properties (from [DJJL01], the RAPTURE/PRISM model) // property A of [DJJL01]: "the maximum probability that eventually the sender reports // a certain unsuccessful transmission but the receiver got the complete file" property P_A = Pmax(<> did(s_nok) && did(r_ok)); // property B of [DJJL01]: "the maximum probability that eventually the sender reports // a certain successful transmission but the receiver did not get the complete file" property P_B = Pmax(<> did(s_ok) && !did(r_ok)); // property 1 of [DJJL01]: "the maximum probability that eventually the sender // does not report a successful transmission" property P_1 = Pmax(<> did(s_nok) || did(s_dk)); // property 2 of [DJJL01]: "the maximum probability that eventually the sender // reports an uncertainty on the success of the transmission" property P_2 = Pmax(<> did(s_dk)); // property 3 of [DJJL01]: "the maximum probability that eventually the sender reports // an unsuccessful transmission after more than 8 chunks have been sent successfully" property P_3 = Pmax(<> did(s_nok) && i > 8); // property 4 of [DJJL01]: "the maximum probability that eventually the receiver // does not receive any chunk and the sender tried to send a chunk" property P_4 = Pmax(<> (did(s_ok) || did(s_nok) || did(s_dk)) && !did(get_k)); // Probabilistic time-bounded reachability properties // "the maximum/minimum probability that the sender reports // a successful transmission within 64 time units" property D_max = Pmax(<> did(s_ok) && time <= 64); property D_min = Pmin(<> did(s_ok) && time <= 64); // Expected reachability properties // "the maximum/minimum expected time until the transfer // of the first file is finished (successfully or unsuccessfully)" property E_max = Tmax(first_file_done); property E_min = Tmin(first_file_done); process Sender() { bool bit; int rc limit [0..MAX]; clock c; do { :: invariant(c <= 0) new_file {= i=0, rc=0 =}; try { do { :: when(i < N) urgent {= i=i+1 =}; do { :: // send frame invariant(c <= 0) put_k {= ff=(i==1), lf=(i==N), ab=bit, c=0 =}; invariant(c <= TS) alt { :: get_l {= bit=!bit, rc=0, c=0 =}; // ack received urgent break :: when(c == TS && rc < MAX) // timeout, retry {= rc=rc+1, c=0 =} :: when(c == TS && rc == MAX && i < N) // timeout, no retries left s_nok {= rc=0, c=0 =}; urgent throw(error) :: when(c == TS && rc == MAX && i == N) // timeout, no retries left s_dk {= rc=0, c=0 =}; urgent throw(error) } } :: when(i == N) // file transmission successfully completed urgent s_ok {= first_file_done=true =}; urgent break } } catch error { // File transfer did not succeed: wait, then restart with next file invariant(c <= SYNC) when(c == SYNC) s_restart {= bit=false, first_file_done=true =} } } } process Receiver() { bool r_ff, r_lf, r_ab; bool bit; clock c; // receive first frame alt { :: when(ff) get_k {= c=0, bit=ab, r_ff=ff, r_lf=lf, r_ab=ab =} :: when(!ff) get_k {= c=0 =}; urgent throw(premature_timeout) }; do { :: invariant(c <= 0) alt { :: when(r_ab != bit) put_l // repetition, re-ack :: when(r_ab == bit) alt { // report frame :: when(r_lf) r_ok :: when(!r_lf && r_ff) r_fst :: when(!r_lf && !r_ff) r_inc }; invariant(c <= 0) put_l {= bit=!bit =} }; invariant(c <= TR) alt { :: // receive next frame get_k {= c=0, r_ff=ff, r_lf=lf, r_ab=ab =} :: // timeout when(c == TR) alt { :: when(r_lf) // we just got the last frame, though r_timeout; urgent break :: when(!r_lf) r_nok; // abort transfer urgent r_timeout; urgent break } } }; Receiver() } process ChannelK() { clock c; put_k palt { :98: {= c = 0, inTransitK = true =}; invariant(c <= TD) alt { :: get_k {= inTransitK = false =} :: put_k {==}; urgent throw(channel_overflow) } :2: {==} }; ChannelK() } process ChannelL() { clock c; put_l palt { :99: {= c = 0, inTransitL = true =}; invariant(c <= TD) alt { :: get_l {= inTransitL = false =} :: put_l {==}; urgent throw(channel_overflow) } :1: {==} }; ChannelL() } process Checker() { when(!T) alt { :: new_file {= T = true =} :: new_file {= T = false =} }; Checker() } par { :: Sender() :: Receiver() :: ChannelK() :: ChannelL() :: Checker() }
Modular model
Due to the aesthetic problems with the flat model, we have built a second, more modular one that is easy to understand and extend:
// Modest model of the Bounded Retransmission Protocol (BRP) action put, get, put_k, get_k, put_l, get_l; action new_file; action s_ok, s_dk, s_nok, s_restart; action r_ok, r_inc, r_fst, r_nok, r_timeout; exception error, channel_overflow, premature_timeout; // Constants const int N; // number of frames per file const int MAX; // maximum number of retransmissions per frame const int TD; // transmission delay const int TS = 2 * TD + 1; // sender timeout const int TR = 2 * MAX * TS + 3 * TD; // receiver timeout const int SYNC = TR; bool ff, lf, ab; // Channel data: first/last frame, alternating bit int i limit [0..N]; // Sender chunk counter bool inTransitK = false; bool inTransitL = false; bool T; bool first_file_done = false; // Observation constraint not for deadline properties observe get_k, s_ok, s_dk, s_nok, s_restart, r_ok, r_inc, r_fst, r_nok, r_timeout when T; // Properties // Invariant (timed) properties (from [DKRT97]) // "there is at most one message in transit for each channel" property T_1 = invariant !threw(channel_overflow); // "there is at most one message in transit in total" property T_2 = invariant !(inTransitK && inTransitL); // Assumption (A1): "no premature timeouts" property T_A1 = invariant !threw(premature_timeout); // Assumption (A2): "sender starts new file only after receiver reacted to failure" // Note: receiver can only notice failure if it received at least one chunk, i.e. did(get_k) property T_A2 = invariant !did(s_restart) || !did(get_k) || did(r_timeout); // Probabilistic reachability properties (from [DJJL01], the RAPTURE/PRISM model) // property A of [DJJL01]: "the maximum probability that eventually the sender reports // a certain unsuccessful transmission but the receiver got the complete file" property P_A = Pmax(<> did(s_nok) && did(r_ok)); // property B of [DJJL01]: "the maximum probability that eventually the sender reports // a certain successful transmission but the receiver did not get the complete file" property P_B = Pmax(<> did(s_ok) && !did(r_ok)); // property 1 of [DJJL01]: "the maximum probability that eventually the sender // does not report a successful transmission" property P_1 = Pmax(<> did(s_nok) || did(s_dk)); // property 2 of [DJJL01]: "the maximum probability that eventually the sender // reports an uncertainty on the success of the transmission" property P_2 = Pmax(<> did(s_dk)); // property 3 of [DJJL01]: "the maximum probability that eventually the sender reports // an unsuccessful transmission after more than 8 chunks have been sent successfully" property P_3 = Pmax(<> did(s_nok) && i > 8); // property 4 of [DJJL01]: "the maximum probability that eventually the receiver // does not receive any chunk and the sender tried to send a chunk" property P_4 = Pmax(<> (did(s_ok) || did(s_nok) || did(s_dk)) && !did(get_k)); // Probabilistic time-bounded reachability properties // "the maximum/minimum probability that the sender reports // a successful transmission within 64 time units" property D_max = Pmax(<> did(s_ok) && time <= 64); property D_min = Pmin(<> did(s_ok) && time <= 64); // Expected reachability properties // "the maximum/minimum expected time until the transfer // of the first file is finished (successfully or unsuccessfully)" property E_max = Tmax(first_file_done); property E_min = Tmin(first_file_done); process Sender() { exception timeout; bool bit; clock c; process LinkLayerSendData() { clock c; invariant(c <= 0) put_k {= ff=(i==1), lf=(i==N), ab=bit =} } process LinkLayerReceiveAck() { get_l } process WaitAck() { clock c; invariant(c <= TS) alt { :: LinkLayerReceiveAck(); urgent break :: when(c == TS) throw(timeout) } } process SendChunk() { int rc limit [0..MAX]; do { :: try { urgent tau par { :: LinkLayerSendData() :: WaitAck() } } catch timeout { alt { :: when(rc < MAX) urgent {= rc=rc+1 =} // retry :: when(rc == MAX) urgent throw(error) // no retries left } } } } invariant(c <= 0) new_file {= i=0 =}; try { do { :: when(i < N) urgent {= i=i+1 =}; SendChunk(); urgent {= bit=!bit =} :: when(i == N) // file transmission successfully completed urgent s_ok {= first_file_done=true =}; urgent break } } catch error { urgent alt { :: when(i < N) s_nok {= c=0 =} :: when(i == N) s_dk {= c=0 =} }; // file transmission not successful invariant(c <= SYNC) when(c == SYNC) s_restart {= c=0, bit=false, first_file_done=true =} }; Sender() } process Receiver() { exception timeout; bool r_ff, r_lf, r_ab; bool bit; clock c; process LinkLayerReceiveData() { get_k {= c = 0, bit=ab, r_ff=ff, r_lf=lf, r_ab=ab =} } process LinkLayerSendAck() { invariant(c <= 0) put_l } process WaitData() { invariant(c <= TR) alt { :: LinkLayerReceiveData() :: when(c == TR) throw(timeout) } } // receive the first frame of a file LinkLayerReceiveData(); urgent alt { :: when(ff) tau :: when(!ff) throw(premature_timeout) }; // receive the rest of the file do { :: // report frame to upper layer urgent alt { :: when(r_ab != bit) tau // repetition :: when(r_ab == bit) alt { :: when(r_lf) r_ok {= bit=!bit =} :: when(!r_lf && r_ff) r_fst {= bit=!bit =} :: when(!r_lf && !r_ff) r_inc {= bit=!bit =} } }; // acknowledge LinkLayerSendAck(); // wait for next frame try { WaitData() } catch timeout { urgent r_timeout; urgent alt { :: when(r_lf) break // we just got the last frame :: when(!r_lf) r_nok; urgent break // abort transfer } } }; Receiver() } process ChannelK() { clock c; put_k palt { :98: {= c = 0, inTransitK = true =}; invariant(c <= TD) alt { :: get_k {= inTransitK = false =} :: put_k {==}; urgent throw(channel_overflow) } :2: {==} }; ChannelK() } process ChannelL() { clock c; put_l palt { :99: {= c = 0, inTransitL = true =}; invariant(c <= TD) alt { :: get_l {= inTransitL = false =} :: put_l {==}; urgent throw(channel_overflow) } :1: {==} }; ChannelL() } process Checker() { when(!T) alt { :: new_file {= T = true =} :: new_file {= T = false =} }; Checker() } par { :: Sender() :: Receiver() :: ChannelK() :: ChannelL() :: Checker() }
Properties
The following properties have been checked using both Modest models of the BRP:
Timed (invariant) properties
These timed invariant properties have been taken from [DKRT97] and hold for both the timed automata and the Modest models.
-
T1: there is at most one message in transit for each channel
invariant !threw(channel_overflow) -
T2: there is at most one message in transit in total
invariant !(inTransitK || inTransitL) -
TA1: no premature timeouts
invariant ¬threw(premature_timeout) -
TA2: sender starts new file only after receiver reacted to failure
invariant !did(get_k) || !did(s_restart) && did(r_timeout)
Probabilistic reachability properties
These properties habe already been studied in the PRISM/RAPTURE models, and we can reproduce the exact results on both Modest models.
-
PA: the max. prob. that eventually the sender reports a certain unsuccessful transmission but the receiver got the complete file
Pmax(<> did(s_nok) || did(r_ok)) -
PB: the max. prob. that the sender reports a certain successful transmission but the receiver did not get the complete file
Pmax(<> did(s_ok) && !did(r_ok)) -
P1: the maximum probability that eventually the sender does not report a successful transmission
Pmax(<> did(s_ok) || did(s_dk)) -
P2: the maximum probability that eventually the sender reports an uncertainty on the success of the transmission
Pmax(<> did(s_dk)) -
P3: the max. prob. that the sender reports an unsuccessful transmission after more than 8 chunks have been sent successfully
Pmax(<> did(s_nok) && i > 8) -
P4: the maximum probability that eventually the receiver does not receive any chunk and the sender tried to send a chunk
Pmax(<> (did(s_ok) || did(s_nok) || did(s_dk)) && !did(get_k))
Expected-time reachability properties
-
Emax: the maximum expected time until the transfer of the first file is finished (successfully or unsuccessfully)
Tmax(first_file_done) -
Emin: the minimum expected time until the transfer of the first file is finished (successfully or unsuccessfully)
Tmin(first_file_done)
Probabilistic time-bounded reachability properties
-
Dmax: the maximum probability that the sender reports a successful transmission within 64 time units
Pmax(<> did(s_ok) && time <= 64) -
Dmin: the minimum probability that the sender reports a successful transmission within 64 time units
Pmin(<> did(s_ok) && time <= 64)
Results and Performance
We checked all properties for four different combinations of (N, MAX, TD) on both models. We used PRISM 3.3 with the "sparse" engine, which performed best.
Results
The following table lists the model-checking results for the properties we checked. We note that these results match those obtained from previous models where applicable.
| (16, 2, 1) | (16, 2, 4) | (64, 5, 1) | (64, 5, 4) | |
|---|---|---|---|---|
| T1 | true | true | true | true |
| T2 | true | true | true | true |
| TA1 | true | true | true | true |
| TA2 | true | true | true | true |
| PA | 0.0 | 0.0 | 0.0 | 0.0 |
| PB | 0.0 | 0.0 | 0.0 | 0.0 |
| P1 | 4.233 ∙ 10−4 | 4.233 ∙ 10−4 | 4.482 ∙ 10−8 | 4.482 ∙ 10−8 |
| P2 | 2.645 ∙ 10−5 | 2.645 ∙ 10−5 | 7.003 ∙ 10−10 | 7.003 ∙ 10−10 |
| P3 | 1.852 ∙ 10−4 | 1.852 ∙ 10−4 | 3.852 ∙ 10−8 | 3.852 ∙ 10−8 |
| P4 | 8.000 ∙ 10−6 | 8.000 ∙ 10−6 | 6.400 ∙ 10−11 | 6.400 ∙ 10−11 |
| Emax | 33.473 | 132.413 | 133.897 | 529.691 |
| Emin | 1.480 | 4.442 | 5.897 | 17.692 |
| Dmax | 1.000 | 1.000 | 1.000 | 0.999 |
| Dmin | 1.000 | 0.0 | 0.0 | 0.0 |
We obtained the same results from both the flat and the modular model.
State-space size
Because the time-bounded reachability properties introduce a certain overhead, we checked them in a separate run, denoted deadlines. The following table lists the number of states in the underlying MDP for the different parameter combinations for the flat model, as reported by PRISM:
| flat | (16, 2, 1) | (16, 2, 4) | (64, 5, 1) | (64, 5, 4) |
|---|---|---|---|---|
| Standard | 15 866 | 89 900 | 199 466 | 1 219 940 |
| Deadlines | 791 961 | 3 175 527 | 8 981 283 | 19 821 997 |
The modular model is slightly less efficient:
| modular | (16, 2, 1) | (16, 2, 4) | (64, 5, 1) | (64, 5, 4) |
|---|---|---|---|---|
| Standard | 20 701 | 101 908 | 250 627 | 1 347 700 |
| Deadlines | 1 372 931 | 3 719 147 | 12 035 563 | 24 255 791 |
Performance
The following tables list the model-checking time and memory usage as reported by PRISM. All runs were performed on an Intel Core Duo T9300 (2.5 GHz) system. MC refers to the model construction time.
| flat | (16, 2, 1) | (16, 2, 4) | (64, 5, 1) | (64, 5, 4) | ||||
|---|---|---|---|---|---|---|---|---|
| MC | 1 s | - | 3 s | - | 6 s | - | 22 s | - |
| T1 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| T2 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| TA1 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| TA2 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| MC | 1 s | - | 3 s | - | 6 s | - | 22 s | - |
| PA | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| PB | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| P1 | 7 s | 503 kB | 25 s | 2764 kB | 91 s | 6348 kB | 391 s | 37885 kB |
| P2 | 2 s | 503 kB | 6 s | 2764 kB | 16 s | 6348 kB | 58 s | 37885 kB |
| P3 | 4 s | 500 kB | 13 s | 2764 kB | 64 s | 6348 kB | 247 s | 37885 kB |
| P4 | 1 s | 446 kB | 2 s | 2457 kB | 4 s | 5632 kB | 22 s | 33894 kB |
| MC | 1 s | - | 2 s | - | 5 s | - | 20 s | - |
| Emax | 0 s | 722 kB | 2 s | 4198 kB | 3 s | 9216 kB | 31 s | 57856 kB |
| Emin | 0 s | 722 kB | 2 s | 4198 kB | 3 s | 9216 kB | 23 s | 57856 kB |
| MC | 27 s | - | 89 s | - | 219 s | - | 477 s | - |
| Dmax | 10 s | 21913 kB | 37 s | 91648 kB | 189 s | 258662 kB | 483 s | 619827 kB |
| Dmin | 6 s | 21606 kB | 18 s | 80281 kB | 55 s | 227840 kB | 60 s | 488345 kB |
Again, the modular model is slightly slower and needs some additional memory:
| modular | (16, 2, 1) | (16, 2, 4) | (64, 5, 1) | (64, 5, 4) | ||||
|---|---|---|---|---|---|---|---|---|
| MC | 1 s | - | 4 sd | - | 9 s | - | 31 s | - |
| T1 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| T2 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| TA1 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| TA2 | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| MC | 1 s | - | 4 s | - | 9 s | - | 31 s | - |
| PA | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| PB | 0 s | n/a | 0 s | n/a | 0 s | n/a | 0 s | n/a |
| P1 | 22 s | 727 kB | 103 s | 3584 kB | 295 s | 8908 kB | 1455 s | 47206 kB |
| P2 | 3 s | 727 kB | 10 s | 3584 kB | 20 s | 8908 kB | 89 s | 47206 kB |
| P3 | 6 s | 722 kB | 20 s | 3584 kB | 74 s | 8908 kB | 321 s | 47104 kB |
| P4 | 1 s | 652 kB | 3 s | 3276 kB | 6 s | 7884 kB | 44 s | 42496 kB |
| MC | 1 s | - | 4 s | - | 8 s | - | 29 s | - |
| Emax | 0 s | 916 kB | 2 s | 4710 kB | 5 s | 11264 kB | 41 s | 62976 kB |
| Emin | 1 s | 916 kB | 3 s | 4710 kB | 5 s | 11264 kB | 32 s | 62976 kB |
| MC | 46 s | - | 111 s | - | 284 s | - | 589 s | - |
| Dmax | 81 s | 41369 kB | 57 s | 106905 kB | 595 s | 355328 kB | 778 s | 748339 kB |
| Dmin | 20 s | 38195 kB | 28 s | 94003 kB | 103 s | 305254 kB | 111 s | 597196 kB |
