Deploying a resilient, highly available Lync Server 2010 environment is high on the list of requirements for most organisations during the design phase of their deployment, and companies deploying Group Chat are no exception.
Each Group Chat server can support 20,000 users and you can deploy up to 3 of them in a pool, meeting the scale requirements of all but the world’s largest companies. This isn’t an everyday deployment scenario, but it will be commonplace for those organisations requiring resiliency for Group Chat (think financial orgs, trading companies, global teams with “follow the sun” operations).
Unfortunately some of the TechNet documentation is a bit unclear around how to successfully deploy a multiple server Group Chat environment but during a recent Microsoft support case, we discovered the right way to do things. In this post I’ll cover how Group Chat servers interact with the Lync topology and how to correctly get your servers up and running.
Understanding a Multiple Server Topology
If we take a look at the Components and Topologies for Group Chat Server TechNet article, we can quickly get an idea of the two (single and multiple server) Group Chat Server topologies that can be deployed. The multiple server topology, as you can see below, has a lot of moving parts:
Lookup Service + Channel Service = Group Chat Server
Essentially, a Group Chat Server is just the Lookup and Channel services, which plug into a SQL database, plus the Web service.
The Lookup Service is how your Lync Front End pool speaks to each Group Chat Server (via the OCSChat@domain.com SIP URI) in the Group Chat pool. Once connected, this service then assigns a Channel Service to the user so they can access their chat rooms.
The Channel Service does most of the heavy lifting of chat room data in Group Chat and also acts as a relay to other Channel Services running on other Group Chat Servers. Finally, the Web Service acts as a front-end for clients to download file attachments in chat rooms.
Just to be clear here, the Lookup Service is able to perform load balancing because it’s just a SIP enabled account, so no need for a hardware load balancer in Group Chat (bonus right?). 🙂
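To make the hand-off described above a bit more concrete, here is a toy model of a Lookup Service assigning Channel Services to users. It is only a sketch: the round-robin policy, the class name and the server names (gc01/gc02.contoso.com) are assumptions for illustration, not how Group Chat actually chooses a Channel Service.

```python
from itertools import cycle


class LookupService:
    """Toy stand-in for the Lookup Service: hands each user a Channel Service."""

    def __init__(self, channel_services):
        if not channel_services:
            raise ValueError("at least one Channel Service is required")
        # Assumption: plain round-robin; the real selection logic is not documented here.
        self._next = cycle(channel_services)
        self.assignments = {}

    def connect(self, user):
        # A user who has already connected keeps the same Channel Service.
        if user not in self.assignments:
            self.assignments[user] = next(self._next)
        return self.assignments[user]


# Hypothetical two-server Group Chat pool.
lookup = LookupService(["gc01.contoso.com", "gc02.contoso.com"])
assigned = [lookup.connect(user) for user in ["alice", "bob", "carol"]]
```

With two servers in the pool, users end up spread across both Channel Services without anything sitting in front of them.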
For more in-depth info on how the Group Chat services work, check out this really good scenario article on TechNet (it refers to OCS 2007 R2, but it’s still relevant).
The SQL database is key
As I mentioned in my post about Group Chat migration, the database is the key component in a deployment as it holds all the configuration and, critically, the chat room data that your users will create. The Group Chat Configuration Tool connects directly to it and reads/writes configuration to and from the tbl.Config table, while the Channel Service connects to it to serve up chat rooms to users. Essentially, each Group Chat server in the topology will connect to this database and become the middle-man between users and the raw chat data stored in SQL.
Making this SQL database highly available (i.e. putting it on a SQL cluster) is important because we don’t want it to be the single point of failure in our topology.
Defining the Group Chat Pool in Topology
When we run the Group Chat Deployment Wizard for the first time, pointing it to a new, empty database, we have to define the Group Chat Pool FQDN. In doing this, the Deployment Wizard creates a new Trusted Application Pool in the Lync topology with this name. Within this trusted application pool in our topology, it creates a trusted application for the first Group Chat server that is a member of the Group Chat pool. As we run the Deployment Wizard on subsequent Group Chat servers, it creates additional trusted applications underneath our Group Chat pool.
So basically within Topology Builder, the Group Chat pool looks like a Lync Front End pool (with servers sitting inside a pool), but it is represented as a trusted application pool because Group Chat isn’t a dedicated, topology aware server role.
Now it gets interesting, because you might think “I just set up DNS and certificates the same way I would for a Lync Front End pool”, but that’s not strictly true.
Certificates and DNS for a Group Chat Pool
DNS records should be created the same way you would for a Front End pool that uses DNS load balancing, i.e. two A records for the pool FQDN (e.g. gcpool.contoso.com), pointing to the IP addresses of the two servers, and an A record for each server’s own FQDN. Pretty straightforward, but certificates are where it deviates.
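The DNS layout described above can be sketched as a small helper that lists the A records you would expect for a pool. The server names and IP addresses are made up for illustration; only gcpool.contoso.com comes from this post.

```python
def expected_a_records(pool_fqdn, servers):
    """Return the (name, ip) A records for a DNS load-balanced pool.

    servers maps each server FQDN to its IP address.
    """
    records = []
    for server_fqdn, ip in servers.items():
        records.append((pool_fqdn, ip))    # one pool record per server
        records.append((server_fqdn, ip))  # the server's own record
    return sorted(records)


# Hypothetical servers and addresses.
servers = {"gc01.contoso.com": "10.0.0.11", "gc02.contoso.com": "10.0.0.12"}
records = expected_a_records("gcpool.contoso.com", servers)
for name, ip in records:
    print(f"{name}\tA\t{ip}")
```

For a two-server pool this gives four records: two for the pool FQDN and one for each server.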
You should get your certificates for Group Chat as per the TechNet article Obtaining certificates, but the gotcha here is that you mustn’t have any SANs (subject alternative names) listed on the certificate assigned to your Group Chat servers. If you do, it will cause a divergence in chat rooms and you will hear reports of users only seeing some of the other users in their chat rooms. Let me set the scene:
- User A will be in Chat Room 1 and see users B, C and D.
- User E will be in the same chat room (Chat Room 1) but will perhaps see users B, F and H.
Basically, the Group Chat Channel Services don’t talk to each other properly and the member list of a chat room isn’t consistent. The certificate assigned to each Group Chat server should carry only the pool FQDN (e.g. gcpool.contoso.com) as its common name. No SANs. 🙂
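If you want to sanity-check a certificate against that rule, something like the following could work. It is a rough sketch that assumes the certificate has already been loaded into the dictionary layout Python’s ssl.SSLSocket.getpeercert() returns; the function name and the sample certificates are hypothetical.

```python
def group_chat_cert_problems(cert, pool_fqdn):
    """List the ways a certificate breaks the 'pool FQDN only, no SANs' rule.

    cert uses the dict layout returned by ssl.SSLSocket.getpeercert().
    """
    problems = []
    # "subject" is a tuple of RDNs, each a tuple of (field, value) pairs.
    common_names = [
        value
        for rdn in cert.get("subject", ())
        for field, value in rdn
        if field == "commonName"
    ]
    if common_names != [pool_fqdn]:
        problems.append(f"common name is {common_names}, expected {pool_fqdn}")
    # "subjectAltName" is a tuple of (kind, value) pairs when SANs are present.
    sans = [value for _kind, value in cert.get("subjectAltName", ())]
    if sans:
        problems.append(f"certificate lists SANs: {sans}")
    return problems


# Hypothetical certificates: one that follows the rule, one with a SAN.
good = {"subject": ((("commonName", "gcpool.contoso.com"),),)}
bad = {
    "subject": ((("commonName", "gcpool.contoso.com"),),),
    "subjectAltName": (("DNS", "gc01.contoso.com"),),
}
good_problems = group_chat_cert_problems(good, "gcpool.contoso.com")
bad_problems = group_chat_cert_problems(bad, "gcpool.contoso.com")
```

An empty list means the certificate matches the rule above; anything else tells you what to fix before assigning it to the server.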
Group Chat is a great tool for a lot of businesses and ensuring it is available 24/7 is critical for organisations that rely on getting the same persistent message out to their teams. Making sure it’s resilient and supports your user base is key to providing a good service to the business.
There is a lot involved with getting a multiple server Group Chat topology up and running (especially after you’ve probably had some dramas just getting one server going) but hopefully this post has given you more of an insight into how Group Chat works in Lync Server 2010 and how to get a nice resilient, load balanced environment.