I discovered this last week while trying to get Remote Call Control (RCC) working between a Lync Server 2010 Enterprise Edition Front End pool and a Mitel Live Business Gateway (LBG). The environment was a greenfield Lync deployment with an Enterprise Edition pool of two Front End servers that uses DNS load balancing.
A few problems were initially observed here in the Lync client when we attempted to make a call using RCC:
- Lync places the call and the Mitel handset goes off-hook and dials. However, once the call is established, the Lync session window does not show the call duration and presence is not set to “In a Call”.
- If you hang up the call by setting the earpiece down on the Mitel handset, the Lync session window doesn’t close like it should. After a few seconds it generates an error and closes.
- If you try to end the call using the Lync session window, nothing happens and the call does not end.
So basically what is happening here is one-way SIP traffic. Let me illustrate and explain what is causing this problem:
- Lync client sends a SIP INFO message that contains the CSTA command MakeCall to the Front End pool (which has a FQDN of pool01.contoso.com).
- One Front End server (in this case, FE1) in the pool passes this to the Mitel LBG to take the handset off-hook and dial the number.
- The LBG tells the 3300 ICP to dial the number.
- The LBG dynamically looks up pool01.contoso.com, the pool FQDN that sent it the SIP traffic. Because it can’t cache DNS records like the Lync client can, each lookup may return the IP address of either FE1 or FE2 as the destination for return traffic. If it sends traffic to FE2, this is where the problem begins.
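To make the sequence above concrete, here is a minimal Python sketch of the failure. It is only a toy model under stated assumptions: the `RoundRobinDns` class and `simulate_rcc_call` function are hypothetical names invented for illustration, the resolver is assumed to rotate strictly between the two Front End servers, and the gateway is assumed to do a fresh lookup for every message. Real DNS answers and the LBG's actual behaviour may differ.

```python
import itertools


class RoundRobinDns:
    """Toy resolver (hypothetical) that rotates its answer on every lookup."""

    def __init__(self, records):
        # One endless rotation per name, e.g. FE1, FE2, FE1, FE2, ...
        self._cycles = {name: itertools.cycle(hosts) for name, hosts in records.items()}

    def resolve(self, name):
        return next(self._cycles[name])


def simulate_rcc_call(dns, pool_fqdn, originating_fe, reply_count):
    """Return (target, accepted) for each reply the gateway sends back to the pool."""
    results = []
    for _ in range(reply_count):
        # Assumption: no DNS caching on the gateway, so every reply triggers a new lookup.
        target = dns.resolve(pool_fqdn)
        # Only the Front End server that originated the dialog knows about it.
        results.append((target, target == originating_fe))
    return results


dns = RoundRobinDns({"pool01.contoso.com": ["FE1", "FE2"]})
outcome = simulate_rcc_call(dns, "pool01.contoso.com", originating_fe="FE1", reply_count=4)
for target, accepted in outcome:
    print(target, "accepted" if accepted else "dropped (no matching dialog on this server)")
```

In this model every second reply lands on FE2 and is dropped, which matches the one-way traffic seen in the Lync client.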
When we run a trace on the Lync Front End Pool, we see the following SIP error message in response to the SIP INFO message the LBG sent FE2:
ms-diagnostics: 1037;reason="Previous hop client did not report diagnostic information";Domain="contoso.com";PeerServer="10.0.10.10";source="FQDN of FE2"
Basically this is FE2 saying “hey LBG, you sent me traffic I didn’t generate, I don’t know what you’re talking about” and FE2 drops the return SIP messages. As a result, the Lync client never sees any return SIP messages to change presence to In a Call, show call duration, etc.
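If you are checking a trace for this error across several servers, a small helper can pull the fields out of the header. The sketch below is an assumption-laden illustration, not a Lync tool: `parse_ms_diagnostics` is a hypothetical function, and it assumes the header is written with plain straight quotes and `key="value"` pairs separated by semicolons, as in the trace line quoted above.

```python
import re


def parse_ms_diagnostics(header_line):
    """Split an ms-diagnostics header into its numeric code and quoted parameters."""
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "ms-diagnostics":
        raise ValueError("not an ms-diagnostics header")
    # The first field is the diagnostic code; the rest are key="value" pairs.
    code, _, rest = value.strip().partition(";")
    params = dict(re.findall(r'([\w-]+)="([^"]*)"', rest))
    return int(code), params


line = (
    'ms-diagnostics: 1037;reason="Previous hop client did not report diagnostic information";'
    'Domain="contoso.com";PeerServer="10.0.10.10";source="FQDN of FE2"'
)
code, params = parse_ms_diagnostics(line)
print(code, params["source"], params["PeerServer"])
```

The `source` field shows which Front End server rejected the message, so a 1037 from a server other than the one that sent the original SIP INFO points at the problem described here.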
I found that there are few ways around this problem that are scalable. Initially I created a manual entry in the local hosts file of the LBG so that the pool FQDN resolved to the IP address of one Front End server only, but once users registered on the second FE and started sending traffic to the LBG, this workaround no longer worked.
Use a load balancer
The only real solution to this is to load balance traffic on port 5060 (or whatever port you are using to communicate with the LBG) to the Front End servers in the pool. This means the LBG only has one IP address to send its traffic to, and the load balancer takes care of session stickiness.
Up until now, all my RCC deployments were either with hardware load-balanced Front End pools or with Standard Edition servers, so I’d never encountered this before. From this discovery I think we can safely conclude that other vendors' CSTA gateways that provide RCC functionality for Lync 2010, such as Cisco Unified Presence Server (CUPS), Avaya’s AES, Nortel CS1000 NRS, Genesys GETS, etc., also do not support DNS load balancing.
So if you’re looking at deploying a Lync Server 2010 Front End Pool and you want to hook it up to your PBX for Remote Call Control, you will need a hardware load balancer of some kind. The Lync supported vendors/models are listed here.
Ok, I think I know too much about RCC now.. 🙂