Microsoft Teams Insider

Microsoft Teams Call Quality Dashboard (CQD): Intelligent Classifiers, Silent Test Call and Power BI

Tom Arbuthnot

James Parkes, Senior Product Manager for Microsoft Teams Call Quality Management, Siunie Sutjahjo, Principal Product Manager for Microsoft Teams Meeting, and Victor Guzman, Senior Technical Program Manager, all at Microsoft, discuss the latest advances in Microsoft Teams call quality tools and reporting.

• How CQD has evolved from on-prem monitoring to cloud-scale Power BI reporting with over 45 report pages in the latest v5.3 release

• The new intelligent classifiers: around 40 models that move beyond legacy network averages, using real user feedback as ground truth to pinpoint issues at the device, compute and network level

• How remote and local models identify whether a quality issue originates from a user's machine, their network or a dominant participant affecting others in the call

• A walkthrough of the Power BI QER reports, including the search report, user health details and media health as starting points for troubleshooting

• Silent test call: a full bi-directional media stream test for proactively evaluating network readiness, available for Microsoft Teams Premium users

• Admin-initiated remote log collection: how IT admins can now pull client logs directly from the Teams Admin Centre without relying on end users

Thanks to Neat, this episode's sponsor, for their continued support of Empowering.Cloud

James Parkes: Silent test call, to go back to our little chat about the days of on-prem Skype for Business and Lync and all that... Some users are like, "Oh, I remember synthetic tests back in the day." You'd set up your two test accounts on a particular pool, and you'd have, you know, Test-CsIM and Test-CsP2PAV and so on and so forth. This is different from that. What that was was like a handshake, a dial-tone type test. What we've built with silent test call is a full bidirectional exchange of media payload.

Tom Arbuthnot: Hi, and welcome back to the show. This week, we are talking Microsoft Teams call and meeting quality. We're talking Call Quality Dashboard, a new classifier there that's really great. Some new reports as well. We get a tour of the reports. Also, silent test call and remote grabbing of client logs. This session is packed full of information to keep your Microsoft Teams environment in tip-top condition. A really great tour of the reports. Thanks to James, Siunie, and Victor for giving us the lowdown, and many thanks to Neat, who are the sponsor of this podcast. Really appreciate all their support. On with the show. Hey, everybody. Welcome to the show. Really excited to have this conversation. I was out in Redmond a few weeks ago now, and we had a version of this conversation at MVP Summit, and there was so much to talk about, we said we had to do a podcast. And so we've got everyone on the show this time. We're gonna get into Teams reporting, CQD, silent test call, a whole bunch of capabilities, and we're lucky to have exactly the right people. Siunie, maybe I could start with you. Could you just introduce yourself and your role, and then we'll talk through CQD's history together as well?

Siunie Sutjahjo: Hi, my name is Siunie Sutjahjo. Most of you probably have heard my name. I shipped Advanced CQD, which internally we call CQD Version 2, in 2018 and 2019. It introduced near-real-time CQD data, which improved the Latency from one day to about 30 minutes. We also introduced Power BI so our customers can customize and take charge of their own reporting if they would like to. Since then, I have been shipping products like large meetings and real-time analytics, but now I am back on the back end, doing supportability and debuggability for our media platform.

Tom Arbuthnot: Awesome. James, do you want to give us a bit of history from your point of view?

James Parkes: Yeah, sure. So, my name is James Parkes. I've been working with Microsoft since 2013. It's been a long time. My focus is on supportability for Microsoft Teams, so that's the IT admin tools such as Call Quality Dashboard, Real-Time Analytics, a lot of the tools we've been building around the best practices checklist, and the new silent test call feature as well. So that's kind of my focus, and my team was the team that built the intelligent classifiers that we're gonna be talking about.

Tom Arbuthnot: Awesome. And Victor?

Victor Guzman: So my name is Victor Guzman. I have been at Microsoft for almost 24 years now, so it's been a while. I started working in the field. I was an MCS Consultant back in the day. Started working with LCS, OCS, Lync, different flavors, and then Teams. And for the last five years, I've been working in CAPE, which is an engineering part of the organization. We are customer facing, and we run the Get to Green program. As part of the Get to Green program, we help customers with quality issues, and that's why some members of my team and myself worked on these CQD templates, and that's where many of the reports come from.

Tom Arbuthnot: I'm really excited to hear your perspective because you're actually using these tools to drive customers to green. For those who don't know the Get to Green program, that's a program that's been around in Microsoft for a long time in various guises where, like, it's a big customer, it's important, things aren't quite right. You fly in kind of the gun team to come and sort it all out. And in our world, you know, back when it was Lync Server and Skype Server and now Teams, obviously, the CQD reporting and the Power BI reports you're gonna show us are a key part of that puzzle. So let's level set together, 'cause a lot of people in our community will know CQD. I'm sure quite a few of us have sweated over the Power BI reports in the past. But this began way back on the on-prem server. And when we were doing the prep call, I'd kind of blocked out some of those memories of the on-prem days. But yeah, it was on-prem first, and then it moved to CQD Online, CQD v3. Siunie, maybe you could take us through some of that history. Why Power BI? Why was that the product of choice?

Siunie Sutjahjo: I can even tell you, like, why CQD at the very beginning, even before that.

Tom Arbuthnot: Go for it.

Siunie Sutjahjo: We needed CQD because at that time we wanted to deploy our product, which is, like, our five products within Microsoft IT or within our tenancy. So we needed a tool which actually monitors the whole system and the whole deployment. And as an IT admin, you don't usually go, "Okay, we'll just deploy for the whole world." You probably do it step by step. And then you also want to monitor, and a certain admin is in charge of the one in Europe, the one in the US or APAC, something like that. So the notion of the whole CQD workflow revolves around helping the IT admin with what we called, before, getting to green, because it was red all over the place.

Tom Arbuthnot: Well, and our product was a real challenge, right? Because going back to those server days, we took dependencies on all the other teams: Exchange, the server infrastructure team, the network team, the WAN, the Wi-Fi, the VPN, the split tunneling. There were a ton of things that could trip Skype server up, virtualization later on. And the end users just saw, at the time, Skype, now Teams, and they'd be like, "It doesn't work." It's like, well, the problem is actually something different, but nobody cares about that. So CQD was one of the things in the arsenal to point out, here's where the problem is in terms of performance.

Siunie Sutjahjo: That's why in the old web CQD we have the canned reporting to monitor the overall health of your deployment. But we also have a detailed report for when you have something specific in mind. Like, I remember one time there was a snowstorm in Seattle, and the admin was excited, like, "How is it calling from home? How's the meeting?" They wanted to make sure that people were still having a good experience with their meetings. And they could actually go through the detailed report and query it and see, oh, okay, you can see the congestion of the mobile network and things like that, and what kind of quality. So those are pretty powerful tools. That's the reason why we understand that customization, directly trying to troubleshoot with a custom query, is important. We added it over there, and it carried over to the time that we needed to do Power BI. So why did we need Power BI? We realized that just doing high-level reporting is not enough. People want to know, okay, something is bad, something dropped 2% or 3%, but where is it? Who is impacted? So that's the drill down. On the other hand, when a help desk gets called, for example, "Oh, Siunie is complaining," they want to know, did I do my meeting inside building 31, and is there a problem with building 30 or something like that? Or in Redmond overall, is building 31 having issues compared to other buildings in Redmond? So that's the drill up, right? And to write all of that in the web UI at that time was pretty challenging. So it is easier to just adopt and leverage tools that Microsoft already has.

Tom Arbuthnot: Yeah, so it turns out you've got a whole product team in-house building a data analytics and visualization platform, so that was handy.

Siunie Sutjahjo: Right, right, right. So that's how we actually decided to kind of partner and leverage Power BI.

Tom Arbuthnot: Awesome. And what was it like moving from on-prem to CQD Online? Because it's a different problem, isn't it? Building a database on a server and then having Power BI connect to it is one thing on-prem, but building at scale is a whole different thing.

Siunie Sutjahjo: Yes, and we remember the time when we had both on-prem and online, and some people had already moved online within a certain tenant. CQD was the only tool at that time that provided a glimpse of how your on-prem experience looked and how your online experience looked. And that actually gave a lot more confidence to the customer, to the admin, to speed up the process because, oh, the online experience is much better. The same thing when we moved from the old Teams client to the new one, right? Based on the data, on the quality, they're like, "Oh, this is really, really good." Like the new VDI or something like that, it actually expedites things. If you have something to compare and you put it in front of the admin's face, it's an easy choice to migrate, because if not, it's a lot of work for the admin.

Tom Arbuthnot: Yeah. At the time when we were moving to cloud, well, A, we're trusting Microsoft with the service, but B, we're changing the traffic patterns, aren't we? We had our WAN, we had our data centers, we were doing QoS a lot back in the day. Now we're basically sending everything out to the cloud, so knowing that's working for users both in the office and out of the office, that report is key.

Siunie Sutjahjo: Right. And they manage their own network, right? So they know the quality of their own on-prem. The power is in their hands. But if you're trusting Microsoft, they were like, "Oh, how do we trust you? Are you actually really, really good at taking care of us?"

Tom Arbuthnot: Trust, trust but verify.

Siunie Sutjahjo: Correct.

Tom Arbuthnot: And James, maybe you can take us kind of up to date with what the landscape is now in terms of Teams and the reporting options. I know you guys have been doing some work on the intelligent classifier as well.

James Parkes: Yeah, certainly. So since we brought out what would be CQD v3, which is the current steady state of CQD as we know it today, we've been building in these new classifiers, and a lot of that is based on some of the things we're able to do with the Azure Data Explorer backend that all the CQD data lives on. So we've been building these models, and the goal really for the tools is to try and provide some sort of root cause analysis. In the past we've always thought about, you know, just provide all the data, provide all the reports, and then let the admins do their jobs in terms of troubleshooting. We're trying to provide a bit of a helping hand where that's concerned and make the tools easier for people to find where their problems are, so they spend less of their time troubleshooting and more of it actually fixing things and getting the product to work. And we've been trying to extend that approach to other products as well, within the Teams stack. So we've got Real-Time Analytics, and we've expanded to things like best practices configurations, which is the dashboard where we show the types of things we would have said back in the day with Skype for Business: "Hey, before you deploy, make sure you've got your QoS set up and make sure you've got all your ports forwarded so, you know, we're using UDP instead of TCP." So we're really trying to take the admin to a point where they're not doing all of this extra legwork just to make the product work. We're identifying issues in advance and pointing to potential root causes. We've got about forty different classifiers built into CQD now. So we've created all these different models. We've done some testing across them. We have quite a wealth of data to work with in the CQD service to train these on. So we started off looking at the legacy classifier.
So the audio classifier has, you know, your Jitter, Packet Loss, Latency, and we found that our false positive rate for that was quite high. Over time, the media stack built into Teams has gotten really, really good at mitigating poor network experiences. And so we found we needed something definitely more precise for admins to rely on than just the old legacy classifier. We still have it around, and it does work for judging network suitability for sending media. But in terms of the actual experience, we wanted something tuned using the user's own perception of quality as ground truth. So when you see that little survey at the end of your call, you know, one-star, five-star, or thumbs up, thumbs down, depending on which client you're currently using, that informs how these classifiers are going to interpret the data that they see. So those are really important, and anytime someone asks me about those, I'm like, "Yes, please answer them." So what we've started off with for the main indicator is the media modality problem. If you see "detected" in the CQD dimension name, that means it's one of our intelligent classifiers. So we have detected media modality, and that will tell us if we believe that the user would have perceived there to be a quality issue. These are all, you know, true/false values. So if you see that's true, then okay, we think the user probably would have perceived a poor experience on that call. From there, we can look at our local model. That'll tell us if that specific user's network was experiencing an issue, or their compute was elevated for some reason and we think that was stealing resources away from the Teams app, or their device, so their microphone, their camera, things like that. We can try and pinpoint where exactly we think that impact was coming from on their side.

James Parkes: Then we can look at...

Tom Arbuthnot: That's super powerful, getting the device information, 'cause often that is a problem, particularly if they're on older devices or they've got antivirus running riot with the Teams app or whatever it may be. Very often it can definitely not be the network. The network can have tons of bandwidth and be ultra low Latency and ultra low Jitter, but you're still being impacted in a different way.

Victor Guzman: Yeah.

Tom Arbuthnot: And- Yeah, 100%.

Victor Guzman: And Tom, one thing that I would like to mention about the new classifiers is that the old classifiers worked on averages. So it was, "Hey, your Packet Loss was 1% on average during the whole call. Your call was perfect." And many times when we worked with customers, that was not the case. They had extreme Packet Loss during portions of the call. So they still complained.

Tom Arbuthnot: Yeah, 1% over the hour, but it was 90% for three minutes.

Victor Guzman: Exactly. And that generated a lot of conflicts with the customer because they were saying, "Hey, you told me that this call was perfect, and my experience is totally different." So that's also something really important about the new classifiers. They're no longer working just on averages, but on real events that happened during the call. So they'll take into account those peaks of Packet Loss, or that increase in Latency, or the user disconnecting and reconnecting during the call, which is really important to know what really happened. And as James was mentioning, the local models are really good at determining if the user experienced the issue because something happened on their machine. But the remote models will actually tell us if they experienced an issue and caused problems for another user, which is also something that happened a lot. You will look at the user that complained during the call. You go to that user, and everything was perfect. They were the ones complaining, but another user was the one having a problem.
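
Victor's point about averages hiding short bursts can be illustrated with a small sketch. The per-minute samples and the 10% "bad minute" threshold below are made-up illustration values, not CQD data or CQD thresholds, and this is not how the intelligent classifiers are implemented; it only shows why a call-wide average can look fine while part of the call was unusable.

```python
from statistics import mean

# Hypothetical per-minute packet loss (%) for a 60-minute call:
# 58 clean minutes and a 2-minute burst at 30% loss.
packet_loss_pct = [0.0] * 40 + [30.0, 30.0] + [0.0] * 18

# Assumed cut-off for a "bad" minute; chosen for illustration only.
BURST_THRESHOLD_PCT = 10.0

average_loss = mean(packet_loss_pct)
peak_loss = max(packet_loss_pct)
bad_minutes = sum(1 for loss in packet_loss_pct if loss >= BURST_THRESHOLD_PCT)

print(f"average loss over the call: {average_loss:.1f}%")
print(f"peak loss in any minute:    {peak_loss:.1f}%")
print(f"minutes at or above {BURST_THRESHOLD_PCT:.0f}% loss: {bad_minutes}")
```

An average-only view reports 1% loss for this call, while the per-minute view surfaces the two minutes in which the user would likely have heard nothing useful.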

Tom Arbuthnot: I love the new identifiers we've got client side as well that can give you a hover for who has the bad network connectivity, 'cause you're right. The person who's complaining is like, "I had a really bad call." What they're complaining about is that the person in Starbucks had bad quality, and they couldn't hear them because they were on Starbucks Wi-Fi. But to your point, it's not necessarily the person that's causing the issue that reports the ticket.

James Parkes: Yeah, absolutely. So it's helped us kind of eliminate those red herrings where you're chasing around a user who's not really causing the issue, but they're hearing, like you said, someone's poor Wi-Fi connection over at a coffee shop. So that's been really helpful for the whole mission of providing that root cause analysis, reducing the amount of legwork that's needed to diagnose call quality problems.

Tom Arbuthnot: Nice. And for those not familiar, we can get UPNs or user identities in this dataset so we can find out who had a problem and potentially help them as well.

Victor Guzman: That's the third column there, the carried-over models, right? We identify who the dominant participants are during the call, which means who is presenting or who is talking the most, and if they experience an issue, we actually send that information to all the other streams to say, "Hey, you are affected because of something that happened with a dominant participant. It's not your fault, you're just affected by this," which helps a lot in troubleshooting.
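
As a rough illustration of the carry-over idea Victor describes, the sketch below copies a dominant participant's locally detected issues onto every other stream in the same call. The `Stream` type, its field names and the `carry_over` helper are hypothetical and exist only for this example; the actual CQD models and dimension names are not shown in this conversation.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical per-user stream record; not a CQD schema."""
    user: str
    is_dominant: bool
    local_issues: set = field(default_factory=set)
    carried_over_issues: set = field(default_factory=set)


def carry_over(streams):
    """Mark non-dominant streams as affected by a dominant participant's issues."""
    dominant_issues = set()
    for stream in streams:
        if stream.is_dominant:
            dominant_issues |= stream.local_issues
    for stream in streams:
        if not stream.is_dominant:
            stream.carried_over_issues = set(dominant_issues)
    return streams


call = carry_over([
    Stream("presenter", is_dominant=True, local_issues={"network"}),
    Stream("listener-a", is_dominant=False),
    Stream("listener-b", is_dominant=False),
])

for stream in call:
    print(stream.user, "local:", sorted(stream.local_issues),
          "carried over:", sorted(stream.carried_over_issues))
```

The listeners end up with no local issues of their own but a carried-over network issue, which matches the "it's not your fault, you're just affected by this" reading.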

Siunie Sutjahjo: So again, just to recap, why we do this intelligent media classifier is to speed up the troubleshooting effort. And, you know, we heard from you that even with all the metrics, it's still a very lengthy experience to troubleshoot.

Tom Arbuthnot: Well, that sounds like the perfect segue for Victor to show us some reporting and show us how easy it is to find issues in the Power BI reports. Victor, maybe you could show us some of the new reports and, I guess, how you drive with customers on Get to Green.

Victor Guzman: Yep. So these are the reports. This is the latest version, version 5.3, which we just released a couple of weeks ago. As I said, a lot of these reports come from requirements that we had when working with customers or that the customers required. So that's how we ended up with 45 different pages for the reports by now. Honestly, I don't use all of them. I just focus on what I find most useful when I'm working. But all the reports were born because some customer asked for it or we asked for it because we had a specific need to troubleshoot a specific issue. So all of them serve a specific purpose. It's just that we tend to gravitate toward some of the reports. But I think one of the first reports that we use is this one, which is a search report, and basically when you get a call from a user that says, "Hey, I have a bad experience," this is where they should start, right? This is the first step.

James Parkes: I just wanted to call out something. So we do have a lot of tabs in this report, and Siunie and I were at a conference recently, Comms vNext, down in Denver, and one of the things I noticed was that a lot of users mistakenly believe that the tabs flagged as hidden in the report are tabs that we're no longer using. Those are drill-downs, and they still have a purpose. I want people to understand that. You can go from these pages, right-click on a particular value, and go to drill through or drill down, and we'll be able to show you more information about that data point that you're curious about. So I did just wanna cover that off, 'cause a lot of people see those hidden, and then they're like, "Oh, I can't use that," or, "That just hasn't been pulled out of the report yet." Not the case. Everything in this report has a purpose, and I'm sure Victor will go over that for us.

Victor Guzman: Yeah, I will say the reason why they're hidden is because we don't want people navigating to them directly. Because if you navigate to an end-user report and you don't have a user selected, you will just get the data for all the users in your organization.

Tom Arbuthnot: And this is kind of one of the challenges of Power BI. It's a super powerful tool, but for many people, building reports and running Power BI is a full-time thing. So there's a bit of a hump to get over as an admin or a Teams person to understand how to drive this. It's worth spending that time getting over that hump because it's super powerful. And even if you pick up just two or three things Victor's gonna show you, starting with these, I think, is a great way to go.

Victor Guzman: Yep. So in this case, I have the search, and I have it drilled down to my user. So you can quickly see some things already here. And this is all using the new classifier. You can see, for example, that for my user, I have eighty-seven sessions in this timeframe. And you can already see that some of the issues are with the poor sending device rate and poor incoming compute rate. So it seems like I'm having issues on my machine, with my compute device, and I'll show you some of that in a second. But one of the things that you can do is actually just select the user and drill through. And this is the user health details report, which is one of the reports that we have hidden at the bottom. When you do this, it'll actually pull that report, but it will have it scoped to that particular user. So this makes it really useful because now I'm looking at just my user, and I can see my experience with Packet Loss, my experience with Latency, the user feedback, the device that I have been using, my compute device. If I have used multiple devices, they will show up here. So for example, you can see that I have an iPhone, I have my Surface laptop, and actually my Surface laptop is the one that is showing some problems. And you can see the same thing by week, right? Here you can see the days that I have more problems, but it seems like I'm consistently having incoming compute problems. You can quickly see this here. So if I wanted, I can actually expand this table, and this will give me a detail of all the calls that I have been in. A lot of information here. It'll tell me when I was the dominant participant in this call, so I can focus on those, and when I was just another participant. And then we have a section which is colored, where you can actually see all the classifiers for each one of the streams.
So setup failures, mid-call failures, media problem. If this turns on, it's because there's a high probability that it was a bad call for me. None of them trigger, but you can see where I had incoming compute problems. So if I select just this one, for example, I can scroll and see that my network in general has been actually pretty good. I don't see any major issues. My Latency, the highest must be these ones with one hundred milliseconds. But in general, I'm below fifty milliseconds for Latency. No issues with Packet Loss. Some problems with Packet Loss here. But I can start to see that I'm actually not having problems with my network. The classifier that triggered was actually compute device, so I can scroll almost all the way to the end, and you can see here what the problem was, right? I was running my machine at eighty-six percent CPU max during the call, and with memory also at very, very high utilization. And you can see that I do that a bunch of times. You can see that that's not an uncommon pattern for me. I work a lot with Power BI, and Power BI does consume a lot of resources. So when I have it open and I'm sharing my screen, many times I run out of CPU and memory. And you can quickly see it here. So that will explain why I was seeing that compute device issue. And again, you can quickly see all the meetings where I was having that problem. You can see it's not always, but it's happening significantly. So for example, this one was pretty bad, and we can see actually what happened. Same thing here. I can actually see that there was some bad Packet Loss at some point. It's talking about the compute device, so I can still move here. And probably this was a really bad... Yeah, I almost got to ninety percent CPU usage. So that's the power of these new classifiers.

Tom Arbuthnot: Yeah, the level of detail you've got here is so impressive, and the classifiers are helping us sift through this immense amount of data to be like, "Here's where you should start looking," essentially.

Victor Guzman: Yeah, and you still need to look at all that data to know what's happening. But this graph here is actually telling you a pretty good story of what my issues are. For example, you can see that I actually dropped calls during this week. And one of the things that people also don't know if they don't use Power BI is that you can actually click on, for example, the weekly graph and drill down to look at the daily view or even an hourly view to see what type of distribution I had. We use this a lot when working with customers to try to determine if they have, for example, congestion during certain times of day or certain days of the week. For example, all of my bad calls were probably happening on this day, on the twenty-eighth, and if I am the user, I might know, oh yeah, I had a bunch of meetings where I was presenting. I had a lot of things open. I probably need to close things down. I have to do something else to prepare my machine before doing it, right?

Tom Arbuthnot: Yeah, or days where I was working from home versus office A versus office B, or on the Wi-Fi versus wired in, those kind of variables.
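
The weekly-to-daily-to-hourly drill-down Victor demonstrates in Power BI is, underneath, a regrouping of the same sessions by a finer time bucket. The sketch below shows that idea outside Power BI, under stated assumptions: the session list, its dates and the `poor_counts` helper are invented for illustration and do not come from the report.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sessions: (start time, whether the call was flagged as poor).
sessions = [
    ("2024-05-27T10:00", False),
    ("2024-05-28T09:00", True),
    ("2024-05-28T09:30", True),
    ("2024-05-28T14:00", True),
    ("2024-05-29T09:15", False),
    ("2024-05-29T15:00", False),
]


def poor_counts(sessions, key):
    """Count poor sessions per bucket; key maps a datetime to a bucket label."""
    counts = Counter()
    for start, is_poor in sessions:
        if is_poor:
            counts[key(datetime.fromisoformat(start))] += 1
    return counts


# Daily view: which day carries the bad calls?
by_day = poor_counts(sessions, key=lambda dt: dt.date().isoformat())
# Hourly view: is there a time-of-day pattern?
by_hour = poor_counts(sessions, key=lambda dt: dt.hour)

print("poor sessions by day: ", dict(by_day))
print("poor sessions by hour:", dict(sorted(by_hour.items())))
```

Swapping the key function for one that returns a location or a connection type would give the home-versus-office or Wi-Fi-versus-wired comparison Tom mentions.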

Victor Guzman: Exactly. And that's really easy to see here on the view. You can actually see a lot of information also from the network. In my case, my network is pretty good, but you can see all the metrics here, which are really useful. Packet Loss, Latency, Jitter, all that stuff. But also a lot of information about where the user is coming from. Am I using a wired network? Am I using a wireless network? Am I connected from a network location in my office, or am I connecting from home? For example, here I'm connecting from home. And you can see all this information at a glance and start to determine patterns where, hey, this user, when he works at home, he has bad problems. And we use this to troubleshoot a lot of issues. For example, we have helped customers whose CIO was having really, really bad quality, and he was complaining about Teams all the time, so we went here. What we saw was that, hey, yeah, when he's working at home from this particular access point, he has a really bad experience. And it ended up that he had just put a Wi-Fi extender on a shed in the backyard, and yeah, it was a wireless...

Tom Arbuthnot: Yeah. I think we've all had one of those. Or like, "No, I'm definitely wired in." It's like, well, the stats are saying you're not wired in.

Victor Guzman: Yeah, and he had great coverage because the extender was sitting right beside him.

Tom Arbuthnot: Yeah, so the Wi-Fi card was... yeah, strong Wi-Fi strength.

Victor Guzman: But then the interconnect is not so good. Exactly. So we are able to see all that stuff when troubleshooting, and that's really why we use this report a lot. And we have similar reports to this for specific things, for meetings, for events, for specific calls, so that you can have this level of detail for all of them.

James Parkes: I'm sorry, Victor, one thing I wanted to point out. When you expanded that listing of all your different streams and you were showing the view of the classifiers, one place where we differ from that legacy audio classifier is that we can fire off multiple models at a time, if we think that more than one thing might have been impacting that call. So I think actually the one you had highlighted, if you expand that again, if you go back over to the model view. Yeah, so the exact row you've got highlighted there. We've got media problem, inbound problem, incoming compute. So inbound problem would be your network, and then inbound compute. Both of those are highlighted. So we think that more than one thing might have been contributing to the poor call. You have multiple places that you can check. It's nice because we're not just troubleshooting one thing and then, you know, the problem comes back and it's, "I thought we fixed that." So we can show a few different issues with just one stream.
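
Because the classifiers are true/false values and several can fire on the same stream, a stream row can be read as a set of flags that each point to a different place to check. The following is a minimal sketch of that reading. The flag names are hypothetical, modelled loosely on the columns James reads out, and the mapping to troubleshooting areas is an assumption made for this example; the real CQD dimension names may differ.

```python
# Hypothetical flag names and the area each one points to; not CQD's actual schema.
AREA_BY_FLAG = {
    "inbound_network_problem": "network",
    "inbound_compute_problem": "compute (CPU/memory)",
    "inbound_device_problem": "device (microphone, camera)",
}


def areas_to_check(stream):
    """Return every area whose flag fired, but only if a media problem was detected."""
    if not stream.get("media_problem", False):
        return []
    return [area for flag, area in AREA_BY_FLAG.items() if stream.get(flag, False)]


# A stream like the highlighted row: media problem plus two contributing models.
stream = {
    "media_problem": True,
    "inbound_network_problem": True,
    "inbound_compute_problem": True,
    "inbound_device_problem": False,
}

print(areas_to_check(stream))
```

Returning a list rather than a single label reflects the point made above: one poor stream can have more than one contributing cause, so fixing only the first may not make the problem go away.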

Victor Guzman: And really, I filtered to just this one so you can quickly see it. You have the media problem here, and you have the inbound problem and the compute problem. You can see here immediately that there's a peak of Jitter causing problems. And then you can also see the CPU issue that we're talking about, with 90% system CPU use. So again, it's really easy when you see this fire, as James was saying, just to zoom in and see what's happening. And we have a lot of filters here that can help with that, like if you want to see how the user is doing on Wi-Fi versus wired, whether he had a poor call or not, what type of media. All these filters really help narrow down the experience for the user.
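
As a rough illustration of the point above, that several models can fire on a single stream and each one points to a different place to check, here is a minimal Python sketch. The classifier names and the `StreamResult` type are hypothetical, loosely based on the labels read out in the conversation; they are not the actual column names in the Power BI reports.

```python
from dataclasses import dataclass

# Hypothetical classifier names, loosely based on the labels read out in the
# conversation; the real report columns may be named differently.
CHECKS = {
    "media_problem": "overall media quality for the stream",
    "inbound_problem": "inbound network path (jitter, loss, latency)",
    "inbound_compute": "CPU load on the receiving machine",
}


@dataclass
class StreamResult:
    stream_id: str
    fired: set[str]  # names of every classifier that fired on this stream


def places_to_check(stream: StreamResult) -> list[str]:
    """Return one troubleshooting hint per classifier that fired."""
    return [CHECKS[name] for name in sorted(stream.fired) if name in CHECKS]


# A stream where both a network model and a compute model fired.
stream = StreamResult("stream-1", {"inbound_problem", "inbound_compute"})
for hint in places_to_check(stream):
    print(hint)
```

The point of the sketch is only that one stream maps to a list of hints rather than a single verdict, so a second contributing cause is not hidden behind the first.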

Tom Arbuthnot: Where would we start, Victor? That's for when I wanna zoom in on a user. If I wanted to know how the environment is in general, or where I should spend my time, where would you suggest people start in the reports from that point of view?

Victor Guzman: Really, we have a Media Health report. That's where we should always start. There are two workstreams for how to use this. One is from the user perspective: I get an escalation and I want to start troubleshooting something, because the user called me and they have a bad problem. Then we start from that report. The other option is here: I'm working on this, I want to improve the experience for my users, and this is the report that I want to use, which is Media Health. And then if something comes up here, I can drill down to specific things, like media setup if I'm seeing a high number of failures, or media reliability if I'm seeing dropped calls, or the audio health details if I want to look at that in particular. So this is the one that I always start with. There are other reports here that are important, like, for example, Transport, which will tell me how many of my users are still not using UDP, or VPN usage, which was a big deal back in COVID because users moved to working from home and they had a lot of problems. The VDI report's also great if you have issues with your VDI environment. But really this is the place where I start, and then you can start filtering here for buildings, for specific cities, for specific countries. And I have customers that actually have this view already pre-filtered to specific countries and published, so that their network admins in different locations can just look at the scoped view of their tenant.

Tom Arbuthnot: Yeah, it turns out Teams is quite a good network test on an ongoing basis. We'll talk about silent calling in a minute, actually, but just with the users using it all the time. I remember we were often the team that would be banging on the door of the network team when something had accidentally got rerouted or changed. You see it very fast in this data.

Victor Guzman: Yeah, it was the canary in the coal mine, because real-time audio is the first thing that gets affected. The other workloads are more resilient.

Siunie Sutjahjo: And there's more. I know that Victor mentioned the VDI and other reports, but with all the selection of tabs in Power BI, we actually also have Teams Events, which you can also reach from the search, if you wanna take a look at how your Teams events look, including even our previous TLE. Based on the meeting type, you can also look at the quality. And not only that: if you have a certain kind of new subnet in your environment, you can also filter and search based on that subnet.

Tom Arbuthnot: Nice. And this reports on Teams, obviously, the Teams events, the Teams traffic, but does it report on the ECDN element as well?

Victor Guzman: I will say you shouldn't expect the same level of detail that we have for the Teams media, but we do report on that.

Tom Arbuthnot: That's nice. Yeah, I think a lot of people have missed that it's now included, with the recent licensing changes, in the Teams core capability, so a nice value add there to get some reporting as well.

Siunie Sutjahjo: We can have a separate podcast for Teams events next time, Tom.

Tom Arbuthnot: Yeah, I think we might have to. I've already mentally backlogged a full session with Victor going through his tour of all 44 reports.

Siunie Sutjahjo: Our Teams Events report actually includes the presenter and the attendee. The attendee, of course, is on the streaming technology, which is different from our Teams meetings and has a different kind of metrics and details, but we do cover both participants in one report so you can take a look at that.

Tom Arbuthnot: Yeah, very often those events are important because they're big managed events. There might have been thousands, tens of thousands, in Teams events now. So being able to proactively report on the back end that it all went well is a really nice thing.

Siunie Sutjahjo: Yeah.

Tom Arbuthnot: So, I hinted at it there in Victor's slot, but James, talk us through Silent Test Call, 'cause I know you're working on that, and I'm really excited about that capability. Back in my consulting days, we built something with UCMA to do test call bots, and this feels like a much more polished, in-the-box option now.

James Parkes: For sure, yeah. So Silent Test Call, to go back to our little chat about the days of on-prem Skype for Business and Lync and all that. Some users will be like, "Oh, I remember synthetic tests back in the day." You'd set up your two test accounts on a particular pool, and you'd have, you know, Test-CsIM and Test-CsP2PAV and so on and so forth. This is different from that. What that was, was like a handshake, like a dial tone type test. What we've done with Silent Test Call is a full bidirectional exchange of media payload. So this is not just, you know, "Can you hear me?" "Yeah." "Good." And move on. This is 60 seconds, with a minimum utilization of 500 packets, to determine how the network is gonna perform when we're performing a Teams call. Right now we're doing bi-directional audio. Bi-directional video and VBSS are coming. We currently support native Windows and Mac clients. Mobile, MTRA and MTRW, so the Microsoft Teams Rooms systems, those are in the backlog. We're working on that as well.

Tom Arbuthnot: I was gonna ask. I'm championing that on multiple fronts at the moment, 'cause I think that'd be an awesome addition to the Rooms portfolio, to have them running silent media streams in the background when they're not in use.

James Parkes: Yeah, absolutely, and we knew that was gonna be a big draw for customers as well, being able to test those Room systems. I think some people, when they hear that we're gonna be able to test Room systems, get a little excited. Like, "Oh, we're gonna know if the camera's working. We're gonna know if the mic is working." The way we've built it today, for privacy reasons, is we're using all virtual devices. Your camera isn't turning on, your mic isn't turning on, your speakers aren't doing anything. This is all synthetic media. It's a real media stream, but it's not being generated by your capture devices in any way.

Tom Arbuthnot: Yeah. No, that's a really important call-out, actually, 'cause this is the best kind of testing in the sense that it's exactly where the real user is, using the real user machine on the real user endpoint, so all the upside of that. But obviously, for the non-technical people, it's not opening the mic and just spinning up a call. It's a synthetic emulation of the media load.

James Parkes: Exactly. So we're still able to see what the network is going to be doing when we're sending that media payload across. And that's really why we wanted to build the tool: to be able to identify whether a network was going to be ready to run Teams in the way it was configured when the customer went live. Basically, what you can do is target a subnet within your deployment, or multiple subnets. We're working on adding building targeting. Again, customers who have uploaded the building data file will be able to enjoy that once it comes along. But essentially you target your subnets and then recruit all of the Teams Premium licensed users on those subnets to run this bi-directional test with a bot that we have in the Microsoft data center.

Tom Arbuthnot: Yeah, and that is an important call-out. So it's Teams Premium users that you can target and then create this capability.

James Parkes: Yeah. We support conferencing. We also support Teams Town Hall, and that is enabled using the, I think... What is it? The Teams Town Hall license? Is it Town Hall or... I'm not a licensing guy, but...

Tom Arbuthnot: Yeah, I don't know, 'cause it was in Events, but now I think it's just in Teams Enterprise, because the Events stuff moved to the core license.

James Parkes: Yeah. So that's also supported as well. And one of the things we wanted to make sure of is that we don't interrupt calls that are in progress, and if you start a call, we want Teams to be ready to go. So if you're in a call, the test won't run, and if you're going to start a call while the test is running in the background, it'll end the test call gracefully so you can hop on your own conference and you're not interrupted.

Tom Arbuthnot: That's cool. I see for the Rooms scenarios on Windows and Android, you're targeting doing an emulated video stream as well, so we'll get a bigger load of traffic, which is great.

James Parkes: For sure, yeah. And once those other two modalities come along, then we'll be able to do that for real-time protocol as well. So the goal is to kinda simulate all your users using Teams as they normally would throughout the day. It's the upper end of normal usage that we're trying to test, right? This is not a stress test. We're not trying to see how close you can come to DDoS-ing yourself. We're trying to evaluate whether the network is suitable. That's really what we're targeting here. So we do limit it: two tests per day is the most frequent that you can do these calls. We have had a couple of customers who asked for more than that, and the reality is, if everyone starts doing that, everyone's gonna have problems. So we're pulling back on the reins a little bit with how frequently you can do these, since these are full media payloads.

Tom Arbuthnot: Yeah. And you mentioned new scenarios, which is obviously the first idea that comes to mind: we've set this new building up, or this new location up. But I can definitely see this being part of routine testing for some customers, particularly if you're the Teams team in a big enterprise and you don't have direct access to what's going on with the network team or the facilities team, whatever it may be. This is a great way that you, as the Teams team, can take control of a testing scenario and get your own results without needing budget from the other team to do the test. I'd suggest, if you're doing this kind of test, it might be good to flag to your friends in networks that you're gonna be doing it. But it's really nice to have this in the Teams side of the toolkit, if that makes sense.

James Parkes: Yeah. It's a smart thing to add to any process when you're doing a change advisory board and you're going through, "Hey, we're doing this on the network," or, you know, "This switch in this building is going to have a firmware update." Great. The Teams team can go and schedule one of these tests for once that change has been pushed to production, and you can get some feedback: "Hey, we noticed that this particular telemetry has changed since this change was pushed. Maybe we should investigate this or roll it back." So it's a great way to make sure that Teams continues to run optimally even when you're making those changes in other systems that may impact it. One thing that I did want to call out about this is that the users do have to be logged into the client. We have had some requests about things like, "All my machines are logged out, can we still run the test?" And the answer is no. It's a security concern, for starters. If I'm logged out of my application, no one should be able to take advantage or take control of it. The other thing is simply who's authoritative for the client experience when I could log into several different tenants. We have that tenant switching ability built into the client. So for that reason, if the client is logged in, great, it'll execute the test for you in the context of that user, and we'll be able to get the data. If the client's logged out, it's logged out, and it's not gonna do anything.
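
The conditions described so far for a client to take part in a silent test call can be summarised in a small Python sketch. This is only an illustration of the rules as stated in the conversation, not how the service is implemented: the `Client` type, the `can_run_test` function and the reason strings are hypothetical, the licence check is simplified to Teams Premium only (Town Hall licensing was also mentioned), and treating the two-per-day limit as a simple counter is an assumption.

```python
from dataclasses import dataclass

MAX_TESTS_PER_DAY = 2  # "two tests per day" limit mentioned in the conversation


@dataclass
class Client:
    subnet: str
    has_premium_license: bool  # simplification: Teams Premium only
    signed_in: bool
    in_call: bool


def can_run_test(client: Client, target_subnets: set[str],
                 tests_run_today: int) -> tuple[bool, str]:
    """Return (eligible, reason) for a hypothetical silent test call run."""
    if tests_run_today >= MAX_TESTS_PER_DAY:
        return False, "daily limit reached"
    if client.subnet not in target_subnets:
        return False, "subnet not targeted"
    if not client.has_premium_license:
        return False, "no Teams Premium license"
    if not client.signed_in:
        return False, "client is logged out"
    if client.in_call:
        return False, "user is already in a call"
    return True, "eligible"


targets = {"10.0.1.0/24"}
desk = Client("10.0.1.0/24", has_premium_license=True, signed_in=True, in_call=False)
print(can_run_test(desk, targets, tests_run_today=0))
```

The sketch does not model the other behaviour described above, where a test already in progress ends gracefully as soon as the user starts a real call.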

Tom Arbuthnot: And when we come to look at this data, James, are we using the same tooling we talked about before? We're using the Power BI reports, we're using CQD?

James Parkes: Yeah. Basically this is a Teams call, when you think about it, at its heart. So all the telemetry we get is the same telemetry we would get from a full-fledged Teams call. There is a difference here, in that we are not using the intelligent classifiers for this. The reason why is that we've built those classifiers off of ground truth from the user, so the user's experience, and this is headless. We don't have that. No one's rating five star, one star, thumbs up, thumbs down. And the other thing is that this is a network test. We are evaluating how the network's performing, so that's our Jitter, Packet Loss and Latency, and that's something the legacy classifiers work really, really well for. So when you go into this, you will see how the network was performing, and how the streams were measured in terms of their Jitter, their Latency, Packet Loss, et cetera. And you can see on the slide here, we can drill down into specific subnets, see where we had that elevated poor stream count, and figure out what might be causing that.
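
To make the idea of a threshold-based network classifier and a per-subnet drill-down concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not CQD's implementation: the threshold values are placeholders chosen for the example (consult the CQD stream classification documentation for the values Microsoft actually uses), and `StreamMetrics`, `is_poor` and `poor_stream_rate_by_subnet` are hypothetical names.

```python
from dataclasses import dataclass

# Illustrative thresholds only; these are assumptions for the sketch.
MAX_JITTER_MS = 30.0
MAX_PACKET_LOSS_RATE = 0.10
MAX_ROUND_TRIP_MS = 500.0


@dataclass
class StreamMetrics:
    subnet: str
    jitter_ms: float
    packet_loss_rate: float
    round_trip_ms: float


def is_poor(m: StreamMetrics) -> bool:
    """A stream is 'poor' if any single network metric exceeds its threshold."""
    return (m.jitter_ms > MAX_JITTER_MS
            or m.packet_loss_rate > MAX_PACKET_LOSS_RATE
            or m.round_trip_ms > MAX_ROUND_TRIP_MS)


def poor_stream_rate_by_subnet(streams: list[StreamMetrics]) -> dict[str, float]:
    """Fraction of poor streams per subnet, for drilling down by location."""
    totals: dict[str, int] = {}
    poor: dict[str, int] = {}
    for m in streams:
        totals[m.subnet] = totals.get(m.subnet, 0) + 1
        if is_poor(m):
            poor[m.subnet] = poor.get(m.subnet, 0) + 1
    return {subnet: poor.get(subnet, 0) / totals[subnet] for subnet in totals}


streams = [
    StreamMetrics("10.0.1.0/24", 8.0, 0.01, 60.0),
    StreamMetrics("10.0.1.0/24", 45.0, 0.02, 80.0),  # jitter over threshold
    StreamMetrics("10.0.2.0/24", 5.0, 0.00, 40.0),
]
print(poor_stream_rate_by_subnet(streams))
```

Because a headless test has no user rating to learn from, a fixed-threshold rule like this only needs the measured network metrics, which is the reason given above for relying on the legacy classifiers here.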

Tom Arbuthnot: Very good. I'm excited for the Rooms stuff. We'll have to talk again when that's GA, 'cause it'll be nice to have those permanent endpoints in those locations, and they're such important endpoints. I can see silent test call being used a lot more often in that scenario.

James Parkes: Absolutely. You've got something that's always logged in and available, right?

Siunie Sutjahjo: And as Jim said, there are the scenarios that he listed there, like testing when there is a new firmware or a new thing that you are going to deploy. But in addition to that, say there's a certain sector in your deployment that only uses Teams chat, but you wanna move them to Teams meetings, or add the license on top of that. How can you be sure? You don't wanna just enable them and then find out the network is not ready. So this is another scenario where you can use it, to onboard people who haven't been using Teams meetings. Even when they're using Teams just for chat, you can use the silent test to assess their network readiness and then get them ready for Teams meetings.

Tom Arbuthnot: Oh, so if you're stepping up to using Teams for meetings or Teams for phone, this is a good way to check the network and the environment, really.

Siunie Sutjahjo: Correct. Yeah, or if you use something else, and then you wanna check, "Are we ready for Teams meetings?" Then you can use this tool as well to test your environment. We look forward to you guys giving us feedback as well, because there's a lot of potential with this tool.

Tom Arbuthnot: Yeah, really exciting tool. And the last thing we wanna talk about on this show, and I feel like we've got a few spinoff ideas for some future shows for sure, is remote log collection. I feel like this hasn't got enough attention. It's a thing that used to cause a lot of pain for me when I was consulting and doing support. Siunie, maybe you can take us through what remote log collection is and how it works.

Siunie Sutjahjo: So, the story is like this, right? Internally, we have what we call an internal report-the-problem tool, which automatically uploads the logs for our engineering team to debug. I'm giving you a true story that happened in Microsoft. There was somebody who complained that every day at 10:30 AM he was experiencing a problem with his meeting. Everything was choppy, but it was not repeatable. It only happened for a very short period of time. And since something like CQD is averaging the whole meeting and captures only the max Jitter or the network performance, we don't know what is going on unless we have the detailed logs. So, the funny thing is, our lunch starts being served at 11 o'clock. So the cafeteria starts opening and working on our lunch, and every microwave in that cafeteria is on, and it happens that this person's office shared a wall with the Microsoft cafeteria. That's the reason why it was very transient. It only happened during a certain time, and then it's gone, and it's back, and it's gone, and it's back. But we could only assess the problem through this log collection. So we have it internally. And the story is, our customers also have those kinds of transient issues sometimes. But to file a support ticket with Microsoft, they have to collect logs, and it was a dragging experience for the IT admin, at least that's what they were sharing with me, to ask their exec to collect the logs for them.
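
The reason a whole-call summary can miss this kind of transient problem is easy to show with made-up numbers. The following Python sketch uses invented per-minute jitter samples for a hypothetical 60-minute meeting; the 30 ms cut-off is an arbitrary value chosen for the example, not a product threshold.

```python
from statistics import mean

# One invented jitter sample (ms) per minute of a hypothetical 60-minute meeting.
jitter_ms = [5.0] * 60
for minute in range(30, 33):   # three bad minutes in the middle of the call
    jitter_ms[minute] = 120.0

call_average = mean(jitter_ms)  # what a whole-call average reports
call_max = max(jitter_ms)       # shows something spiked, but not when or how long

# Only time-resolved data (like detailed logs) shows when the problem occurred.
bad_minutes = [m for m, j in enumerate(jitter_ms) if j > 30.0]

print(f"average={call_average} ms, max={call_max} ms, bad minutes={bad_minutes}")
```

With these numbers the average comes out at 10.75 ms, which looks healthy, and the maximum of 120 ms says there was a spike without saying when. Only the per-minute view reveals that the trouble was confined to minutes 30 to 32, which is the kind of pattern the detailed logs exposed in the cafeteria story.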

Tom Arbuthnot: Yeah, it's really hard. They're already having a problem, and now you're like, "Can you take more time out of your day to click this, click this, send this, drop this?"

Siunie Sutjahjo: Yeah. "Open this tab and send it to me."

Tom Arbuthnot: And we're often not in the same physical location when we're troubleshooting. We might not even be in the same country, so trying to get those logs definitely was challenging.

Siunie Sutjahjo: Correct, and then the exec is super busy. They might send it to you three days later, and that's not the log you want. So again, these are the challenges in the life of the IT admin, and I surely understand. Admin-initiated remote log collection is trying to address that problem. If your user has a certain kind of issue, especially if your exec has an issue, then you can be the one who collects the logs. You can later on file the ticket with Microsoft or do something else with it, but it's already ready. You can collect the logs when it actually happens, when the exec reports it to you, so you make sure that they are fresh logs.

Tom Arbuthnot: And how does that work? It's Teams Admin Center where we can kick that process off, is that right?

Siunie Sutjahjo: Yes, it looks like that. It's as easy as "request a client log," and then you just wait a little bit, and the logs will be collected for you. Some people actually ask, "Why don't you just send it directly?" Well, the end user has the ability to do that through report a problem, but then the log is sent to us. Some admins actually don't want their end users to send anything directly to Microsoft, because they are very cautious. This way they can make sure of what is in the client log, with the telemetry, that's being sent to Microsoft, so we put the control in the admin's hands instead of just the end user's.

Tom Arbuthnot: Yes, and if they're working a ticket with Microsoft, Microsoft support is gonna ask the person in IT who raised the ticket, so being in that flow of, "Here are the logs. I've got them," makes sense.

Siunie Sutjahjo: Yeah. So that is the exciting thing about admin-initiated remote log collection.

Tom Arbuthnot: Amazing. Well, we have covered so much in this show. I feel like we definitely have to come back and do another deep dive, and I'm excited for the advances in the silent test call. And Victor, if you're up for it, we'll do a longer session on your CQD tool, because I feel like there's so much power in there.

James Parkes: Yeah.

Tom Arbuthnot: Awesome. Well, Victor, Siunie, James, thanks so much for joining. Great to catch up as well, and we'll talk to you all again soon. And we'll share links for how to get started on CQD and all the docs about the new capabilities as well.

Siunie Sutjahjo: Yeah, and then there is also the tech community.

Tom Arbuthnot: Let's link to that as well, yeah, 'cause that's a great place to get started.

Siunie Sutjahjo: Yeah. Thank you so much, Tom, for hosting us, and thanks for the invite to share our tools with your audience.

Tom Arbuthnot: Awesome. Thanks a lot.