If 2021 has taught us anything, it is that what sounds like a consensus online often is not. Our world is dominated by algorithms whose output has been shown to skew our realities. Bad actors have discovered they can influence those algorithms, and they do so for financial gain or just for a laugh. Artificial Intelligence (AI) can provide great value, but AI with bias and/or inaccuracy is something we must actively guard against. This post explores the traps related to user feedback and how over-reliance on that dataset can result in poor outcomes for any AI, especially for chatbots and digital assistants, which are the first line of support for your users.

For the purposes of this post, we will focus our examples on use cases we typically see our customers facing. Users, in this context, are the ones chatting with the bot and looking for support.

What is User Feedback?

User Feedback is a broad term meant to cover both direct and indirect feedback. Direct feedback is when the user is asked for their opinion directly and they reply. You will see this in various forms. For example, the thumbs up and down icons are meant to collect user feedback. You may be asked, “Did this solve your issue?” or “How would you rate this experience?” Have you seen those buttons at a store’s exit with a smiley face, a sad face and something in between? That is a form of direct user feedback.

customer satisfaction buttons

The other type of feedback is far more subtle and indirect. We can look at a user’s actions and from those infer some level of feedback. These patterns can also be called user cues. An example of such a cue is when the user gets an answer and they respond, “you stink!”. The implication is that the user is unhappy about the previous answer. Another cue can be the circumstances under which a user clicks a help button or even asks to speak to a live agent. All of these indicate something may have gone wrong.

The Feedback Challenge

There is no problem with asking for feedback. In general, it is a good practice. There are some challenges, however, so let’s explore those.

Interpreting User Intent

Interpreting the user’s intended meaning is no easy task. Let’s focus on a simple interaction to illustrate this point. With many help desk systems, upon completion of the experience, the user will be asked: “Did this solve your issue?”

Let’s imagine a digital assistant experience…

How much PTO can I borrow?

All our policies can be found in the Policy Center.

The user gives a resounding “NO” to the follow-up question, “Did this solve your issue?” The problem is, we don’t really know why it didn’t solve their issue. If we present them with a big, long survey trying to find out why…well, we all know no one is spending time on that. Back to the point at hand, there could be all sorts of reasons for the “NO”. For example…

  • They are annoyed because the bot didn’t answer directly. It simply gave a link and it is now the user’s problem to find the answer.
  • The user may have found the policy on borrowing PTO, but disagreed with the policy itself, thereby not solving the issue at hand.
  • The user may be unhappy that they are getting an answer about policies seemingly unrelated to the question which was about how much PTO can be borrowed.
  • The user is a bad actor and intentionally provides inaccurate feedback.

Experts say the key to effective user feedback is acting on it. However, the confusion around user intent puts you on shaky ground when trying to act.

Selection

The next problem with user feedback is that many studies suggest the data is not representative of the user community.

Anecdotal evidence from across the web suggests a typical response rate for an online survey is much lower than 10%. That means the vast majority of your customers (90%+) are not telling you what they think. You might be able to argue that away statistically, but in reality are you happy that so many of your customers don’t have a voice?

— customerthermometer.com

We know user feedback tends to have a self-selecting effect. That is to say, the people who participate skew the data away from a true representation of the whole community. The most basic example of this is that unhappy people provide more feedback than happy people. This makes it very difficult to act on a dataset which lacks representation.

Intentional Manipulation

Famously, Microsoft released an AI-powered bot called Tay on Twitter in 2016, a time when we didn’t fully understand the world of unintended consequences in AI. Without going into too much detail, let’s just say this experiment did not go well. “The more you chat with Tay,” said Microsoft, “the smarter it gets.” Have you heard this before?

It is a case where users figured out they could influence the AI, and they knowingly did so. We know humans are capable of this manipulation and, whatever their intentions, we need to actively guard against it. So how does one know the difference between this manipulation and genuine user feedback? If Facebook and Twitter haven’t been able to tell the difference, we should be cautious in thinking we can.

IntraSee’s Feedback Data

Across our customers, many have deployed quick feedback mechanisms like thumbs or star ratings. This feedback is non-interruptive, and the user is not forced to answer. For this type of asynchronous feedback, we are seeing a 3%-4% response rate.

We also collect feedback that is more synchronous: the user can ignore it, but it is not readily obvious that they can continue without responding. This method gets about a 40% response rate, +/- 7%. Clearly, more feedback is gathered with this method, but it can be annoying. To counter that potential frustration, no user is asked too often. There is a delicate balance between getting feedback and being bothersome, and we feel throttling is necessary here even though it reduces the statistical significance of the data.

For one customer, the asynchronous feedback (thumbs/stars) is presented 7.5 times as often as the synchronous prompt. Doing the math, we get almost the same amount (+/- 5%) of feedback data from both models!
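A quick back-of-the-envelope comparison shows why the two channels end up roughly even. The prompt volume below is a made-up illustration; the response rates are the ones quoted above.

```python
# Illustrative comparison of feedback volume from the two collection methods.
# The 10,000 prompt count is hypothetical; the rates come from the figures above.
sync_prompts = 10_000                 # synchronous feedback prompts shown
async_prompts = 7.5 * sync_prompts    # asynchronous prompts are shown 7.5x as often

async_rate = 0.04                     # ~3-4% response rate for thumbs/stars
sync_rate = 0.40                      # ~40% (+/- 7%) response rate when prompted directly

print(f"Asynchronous responses: {async_prompts * async_rate:,.0f}")   # ~3,000
print(f"Synchronous responses:  {sync_prompts * sync_rate:,.0f}")     # ~4,000
# Depending on where in the quoted ranges the real rates fall, the two totals can
# land within a few percent of each other, as they did for this customer.
```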

Automating AI with User Feedback

We now understand that feedback, while valuable, can produce bad outcomes if you are not careful. It is hard to collect, it is often not representative, and interpreting it is rife with miscalculation. In the chatbot industry, there is a technique that takes user feedback data and feeds it into the AI model, but doesn’t that sound problematic when our confidence in this feedback is on shaky ground? Remember how Microsoft said to just use it and it will get better?

Machine Learning AI is the most powerful type of engine behind enterprise-grade digital assistants. That AI uses a model that is trained with data, just as a Tesla uses pictures of stop signs to learn when to stop. When we hear, “just use it and it will get better,” what is really happening is that the training data is improving, which should yield better outcomes. That is, of course, only true if the training data is of high quality.

How does training data improve? Two traditional ways: manually by a data scientist or automatically. How do you automatically update training data? You need to draw upon data sources, so why not use user feedback? For example, if a user clicks the thumbs down, we can assume the AI had a bad outcome, right?

It sounds like a good idea, but it can be a trap! As previously discussed, we see this data collected in fewer than 4% of interactions. Imagine you have 1,000 questions in your bot and get 10,000 user questions in a month. If every question were asked an equal number of times, that would be roughly 400 pieces of feedback spread across 1,000 questions, less than one piece of feedback per question per month! How many months do you need to wait before the feedback has data significance? This effect is even more pronounced if the question is not a top-20 popular question.
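Here is that sparsity math spelled out as a minimal sketch, using the illustrative volumes above:

```python
# Feedback-sparsity arithmetic using the illustrative figures above.
questions_in_bot = 1_000              # distinct questions the bot can answer
user_questions_per_month = 10_000     # questions users actually ask per month
feedback_rate = 0.04                  # < 4% of interactions produce a thumbs up/down

feedback_per_month = user_questions_per_month * feedback_rate   # 400
per_question = feedback_per_month / questions_in_bot            # 0.4

print(f"Feedback events per month: {feedback_per_month:.0f}")
print(f"Feedback per question per month (if traffic were even): {per_question:.1f}")
# Real traffic follows a long tail, so anything outside the top-20 questions
# accumulates feedback even more slowly than this.
```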

It's a trap!

Now consider you wait 6 months to have enough feedback to act on it automatically. What has changed in 6 months? The pandemic has taught us that everything can change! By the time you have enough data, that same data may be stale or, worse, incorrect.

This math all assumes feedback data is good and evenly representative, but as discussed above, we know it is not. Oh my, what a mess! We now have limited data, it is overrepresented by the unhappy, and we are considering automatically amplifying their voice into the AI model?

Time for another practical example.

Do I need another vaccine?

Information about health and wellness can be found by contacting the Wellness Center at 800-555-5555.

This answer isn’t wrong, but there is a better answer which specifically talks about booster shot requirements. The user doesn’t know this answer exists, so logically they click thumbs up or answer “yes” to the question, “did this answer your question?”

If we took this user feedback and automatically fed it into the AI, we would be telling the AI it was right to give this less-than-perfect answer. The system is then automatically reinforcing the wrong outcome. Now amplify this by thousands of interactions and what happens? The AI drowns out the more helpful answer about booster shots. The end result of this slippery slope is continual degradation in the quality of service the user receives.

What’s the Solution?

This is a nuanced problem we spend time thinking about so our customers don’t have to. One solution is to not abandon the human touch. The dirty little secret about Alexa and Siri is that they have thousands of people contributing to the AI by tagging real life interactions. If Apple and Amazon still need the human touch in their AI, then it is probably for good reason.

When teachers teach students, they are curating the experience. Teachers don’t simply ask students, “do you feel you got this test question correct?”. They are grading those tests based on their expertise. Asking students to be the grader is flawed.

While we cannot discuss all our tricks, at IntraSee we will be introducing some new technology in 2022 directly aimed at this challenge. The lesson learned here is that while automating the data that feeds an AI model can be powerful, it is a power that comes with great responsibility. Ask your AI vendors how they solve this challenge. For our customers, it is our problem at IntraSee, not yours. Rest assured, we are all over it so you don’t have to spend a minute on it 😀

Contact Us

Every industrial revolution has been defined by increased efficiency and reduced costs. The new digital revolution we are embarking upon is no different. Things that took days to do can now be done in seconds, and things that used to cost hundreds of dollars can now be accomplished by spending less than one dollar.

Conversational AI is cool, but that’s not why it will change the world. It will change the world because it will be better and cheaper than many of the things we pay humans to do today.

In this blog we will focus on the impact of digital assistants in the world of human resources (HR), and how they will change the way organizations service requests and questions from employees and managers, reducing organizational costs while improving the level of service. We will therefore break down the two areas that should see large reductions in operating costs: the HR help desk and HR staffing levels.

What you will see is that even the most conservative approach to saving costs with a digital assistant will realize a 10%-30% reduction in help desk and HR costs in one year. And that can be doubled in two years. Plus, you’ll be providing better service to your employees and managers too!

As Larry Ellison pointed out last year at Oracle OpenWorld, it’s not the software that is the most expensive item; it’s the cost of all the people who have to deal with all the ramifications of running the software.

1.   HR Help Desk Costs

It has been said that help desks are the cost of (a lack of) quality. Scattered, and often misleading, information and complex processes inevitably force employees to reach out to live agents to help them solve their problems, answer their questions, or complete a task. Help desks are, often, the cost organizations pay for failures elsewhere in their internal systems. 

So, let’s break down the staffing costs of a help desk in order to drive to an expected cost saving:

The average number of service agents per 1,000 seats ranges from 5.4 in the healthcare industry to 21.9 in the financial services industry. This is the metric that defines the staffing levels of an organization’s help desk. To keep the model in this exercise deliberately conservative, we will assume a much lower figure of just 1 agent per 1,000 employees.

This means that for an organization with around 20,000 employees, the number of agents is around 20. In North America the average salary for a service desk analyst is $41,000. Multiply that by 20 and you get $820,000 per year. But that in itself is not the complete picture.

The ratio of agents to total service desk headcount is a measure of managerial efficiency. The average for this metric worldwide is about 78%. What this means is that 78% of service desk personnel are in direct, customer facing service roles. The remaining 22% are supervisors, team leads, trainers, schedulers, QA/QC personnel, etc. And those people are even more expensive. This takes the headcount up to 25.

The average salary for a service desk supervisor is $61,000 and the average for a service desk manager is $75,000, which means that those extra 5 people push the staffing costs up by at least $305,000, driving the total cost of salaries staffing the service desk for an organization of 20,000 up to $1,125,000. And when you factor in utilities, technology and facility expenses, this raises the number to over $1,325,000.

And one final statistic to keep in mind. While the average overall employee turnover for all industries is 15%, inbound customer service centers have a turnover rate on average of 30-45%. It should come as no surprise that service center turnover is at least double what you’d see in other businesses.

Based on age, the differences are stark: workers age 20-24 stay in the job usually just 1.1 years, while workers 25-34 stay 2.7 years on average. 

And the key metric here is that it costs on average around $12,000 to replace agents that leave. Why? The costs of turnover include the following:

  • Recruiting
  • Hiring time (HR time, interview time)
  • Training, including materials and time
  • Low-productivity time when employees first start out
  • Supervisory time
  • Overtime (remaining staff may have to cover extra shifts)

So, going back to our original metric of 20 service desk agents, if 40% leave each year, that equals 8 annual replacements at a total turnover cost of 8*$12,000=$96,000. 

So, as a grand total, an organization of 20,000 employees has to pay around $1,421,000 per year to staff their service desk.

In terms of how cost per ticket is calculated (a key metric), this also depends on the number of tickets closed per agent per year. Again, this varies a lot by industry.

Help desk tickets per agent, per month by industry

Figure 1: Tickets closed per agent per month

The average number of tickets closed per month per agent is around 120 cross-industry, or 1,440 per year. This means that with 20 agents the expected number of cases closed (not always successfully) is 28,800.

This means that the average cost per service ticket is around the $49 mark ($1,421,000 / 28,800) if you take into account a broader range of costs than just service agent salaries.  So, while generally published average costs per ticket are estimated to be around $19-$20 per ticket, the true cost is much higher, but with massive variance based on industry. 
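To make the arithmetic easier to follow, here is the whole cost model as a small sketch. It simply restates the figures above; the only inferred value is the roughly $200,000 of utilities, technology and facility costs implied by the jump from $1,125,000 to over $1,325,000.

```python
# Fully loaded annual help desk cost for a 20,000-employee organization,
# restating the figures from the post.
agents = 20
agent_payroll = agents * 41_000                      # $820,000
supervisory_payroll = 5 * 61_000                     # $305,000 (supervisors, leads, etc.)
salaries = agent_payroll + supervisory_payroll       # $1,125,000

overheads = 200_000      # utilities, technology, facilities (inferred from the $1,325,000 total)

turnover_cost = int(agents * 0.40) * 12_000          # 8 replacements x $12,000 = $96,000

total_annual_cost = salaries + overheads + turnover_cost      # $1,421,000

tickets_closed = agents * 120 * 12                   # ~120 tickets/agent/month = 28,800/year
cost_per_ticket = total_annual_cost / tickets_closed # ~$49

print(f"Total annual help desk cost: ${total_annual_cost:,}")
print(f"Cost per ticket: ${cost_per_ticket:.2f}")
```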

The good news is that the actual logistics around achieving ROI are therefore pretty straightforward. Instead of hiring 40% new staff every year due to attrition, just have the digital assistant pick up the slack and do not hire any new staff. This immediately saves your organization $96,000 in turnover/onboarding costs and lets you drop salary costs by around $450,000 (40% of a $1,125,000 payroll).

It also allows you to reduce other costs associated with your help desk. Utilities, technology costs, and facility space (you can downsize based on the reduced headcount). 

For a digital assistant, the average cost per ticket is less than $1, which means that if you replaced 40% of your help desk calls with digital assistant calls, this would result in digital assistant costs of less than $11,000. Factor in a reduction in headcount and other expenses, plus a hiring freeze, and you would see an overall reduction in costs from $1,421,000 to $806,000 in just one year (see diagram below), and even greater savings after two years.

Help desk cost infographic

Figure 2: HR Help Desk ROI using a Digital Assistant
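As a minimal sketch of that year-one saving (the overhead reduction is inferred from the before-and-after totals above rather than stated directly):

```python
# Year-one help desk ROI, restating the figures above.
baseline_cost = 1_421_000

turnover_savings = 96_000        # hiring freeze: no replacement hiring costs
salary_savings = 450_000         # 40% of the $1,125,000 payroll not re-hired
digital_assistant_cost = 11_000  # ~11,500 deflected tickets at less than $1 each

# Figure 2 lands at $806,000; the remaining gap is the reduction in utilities,
# technology and facility costs implied by the smaller headcount.
overhead_savings = baseline_cost - 806_000 - (turnover_savings + salary_savings - digital_assistant_cost)

year_one_cost = (baseline_cost - turnover_savings - salary_savings
                 - overhead_savings + digital_assistant_cost)
print(f"Implied overhead savings: ${overhead_savings:,}")   # $80,000
print(f"Year-one cost: ${year_one_cost:,}")                 # $806,000
```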

Also, and just as importantly, the quality and accuracy of the digital assistant will continue to increase each subsequent year and will not plateau (as it does with humans). This is due to two factors:

  • Digital assistants don’t leave your organization. There is zero turnover. 
  • Digital assistants benefit from machine learning. The more they see and the better the training they are given, the more accurate they get. As an investment, they are a win-win all round. You teach them something once and they remember it forever. They’ll never leave you or call in sick, they’ll work 24/7, 365 days of the year, and they can even speak multiple languages. 

But this is not where the story of ROI ends, it’s really where it begins. Help desks are really designed to handle the easy, first level stuff. Once you get to the next level (where the agent can’t handle the ticket because it’s too complex for them), the costs are in the hundreds of dollars per ticket as you are now dealing with a more expensive level of staffing and more minutes required to solve the problem or meet the request. This is where HR staffing levels come into play. 

2.   HR Staffing Costs

In the world of HR, HR experts handle many of the day-to-day HR activities and employee/manager requests in an organization. In the same way that there are agent staffing levels per industry, there are also HR staff to employee ratios, and this ratio does vary per industry. Typically, the more complex the organization, the higher the staffing level. But size matters too: there are economies of scale that kick in once an organization gets really big. Being global, or having a mix of full-time and part-time employees, union and non-union, blue-collar and white-collar, will dictate higher ratios than a company where most people fit a similar profile.

But this does not mean that the ratio is stuck and cannot be changed. There is one key aspect of the staffing ratio that is completely within HR’s control and therefore has a huge capacity for change. And by change we mean reduction!

The role of HR is a key variable that influences the HR staff to employee ratio. A highly operational HR department will do different work and require a larger HR workforce compared to a highly strategic HR department. So, what specifically does this mean? How can HR move from being mostly operational to being mostly strategic (a much more fun and productive role, by the way)?

The answer is to move traditional HR admin tasks from humans to a digital assistant.  HR admin work is probably the least popular thing that any highly educated and highly paid HR expert has to do, so removing this onerous work from their plate is a good thing! 

Running reports, answering requests for data, following up with managers to ensure key tasks were performed, entering data into the HCM system: these are all repetitive operational tasks that can be automated and handled by a digital worker.

All this stuff is boring and repetitive to humans, and it takes a lot of time. But to a digital assistant it is fun and can be done extremely quickly. And the “right” digital assistant, with the proper skillset and training, can do almost all the HR administrative tasks that an HR expert can do. Often better, as they don’t forget obscure details and business rules, they don’t make mistakes, and they bring their “A” game every single second of the day. And, as stated before, they don’t leave your organization, turnover is zero, so wisdom is accumulated and not lost via natural attrition. 

So let’s get into the math of the ROI. Bloomberg Law’s 2018 HR Benchmarks Report states that HR departments have a median of 1.5 employees per 100 people in the workforce. At the time, this represented an all-time high as it had long been around 1.0 per 100. Both the Society for Human Resource Management (SHRM) and Bloomberg numbers were very similar, so this number is considered very accurate.

SHRM also noted a clear reduction in the ratio based on organization size (an economy of scale). However, as the size of the company rises, so does the average compensation of HR staff (which explains why published averages are very misleading). Working in HR for a large company can be twice as financially rewarding as for a very small company. The reason is complexity (on many levels): if you want HR people who understand complex organizations, then you have to pay a premium.

HR to Employee Ratio Graph

Figure 3: HR staff to employee ratios cross-industry

Using our example of an average organization with 20,000 employees and a ratio of 0.4 HR staff for every 100 employees, the HR staffing level would be around 80. At this size of organization the average level of HR compensation would be around $100,000, making the total spend $8,000,000 per year. Note: that’s a lot more than the $1,125,000 spent on service desk salaries.

In the world of HR staffing, turnover is more in line with other industries, around 15% per year, though the cost of hiring HR staff is much higher than the $12,000 for service agents. For HR staff it costs roughly $30,000 to replace each person who leaves (recruiting, interviewing, training, etc.). So, in our example, 12 new staff are required every year at a turnover cost of $360,000, making the total annual cost $8,360,000.
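Restating that baseline as a quick sketch:

```python
# Baseline annual HR staffing cost for the 20,000-employee example.
employees = 20_000
hr_staff = int(employees * 0.4 / 100)                # 0.4 per 100 employees = 80

hr_payroll = hr_staff * 100_000                      # $8,000,000 at ~$100,000 average comp

turnover_rate = 0.15                                 # ~15% per year
replacement_cost = 30_000                            # recruiting, interviewing, training per hire
turnover_cost = int(hr_staff * turnover_rate) * replacement_cost   # 12 x $30,000 = $360,000

total_hr_cost = hr_payroll + turnover_cost           # $8,360,000
print(f"Baseline annual HR staffing cost: ${total_hr_cost:,}")
```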

The big question then is how much of this work can be taken over by a digital assistant? The answer isn’t quite as clear as with the service desk. It all depends on the skillset of the digital assistant, and HR taking a proactive approach to how it replaces natural attrition of HR staff. 

But the expectation, based on the results of early projects, is that the best digital assistants can take over at least 10-30% of HR admin work from HR staff. And that is just for 2020. This number should leap forward each year for the top digital assistant performers.

Using a conservative approach, if a company decided to hire just 5% new HR staff each year instead of the usual 15%, and used the digital assistant to pick up the slack of the 10% of the positions left unfilled, the savings would still be considerable. Let’s examine the resulting cost savings and see how this looks in detail. 

In this scenario, HR staffing costs would drop in year one from $8,360,000 to $7,320,000, as the staffing level falls from 80 to 72 (12 people would leave and only 4 new people would be hired, at a turnover cost of $120,000). At the same time, digital assistant operational costs of roughly $360,000 would cover the slack, bringing the total to $7,680,000. So, the total net saving in year one would be $680,000 (excluding implementation and configuration costs), with huge potential for much bigger savings in future years.

Implementation and configuration of a digital assistant that could handle both help desk and HR admin tasks would likely cost in the realm of $100,000 to $250,000. But this would be a one-time fee and would result in a year one total saving in HR staffing costs of between $800,000 and $900,000.

Year two would see greater savings, as there would be no implementation costs and the new hire rate would again be capped at 5%, taking the HR staffing level from 72 to 65.

Year two HR staffing costs would therefore be $6,500,000, with a turnover cost of $90,000 (10 people would leave and only 3 new people would be hired), giving a total of $6,590,000. Because the digital assistant would be taking on more work in year two, its cost would rise to $396,000, resulting in a total cost in year two of $6,986,000 (see graph below for details).

HR professional cost infographic

Figure 4: HR Staffing ROI using a Digital Assistant
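Pulling the two years together in one small sketch (all figures restated from above):

```python
# Two-year HR staffing scenario: replace only part of the natural attrition and
# let the digital assistant (DA) absorb the rest. All figures restated from the post.

def year_cost(staff, hires, da_cost, avg_comp=100_000, replace_cost=30_000):
    """Annual cost = HR payroll + cost of the hires actually made + DA operating cost."""
    return staff * avg_comp + hires * replace_cost + da_cost

baseline = year_cost(80, 12, 0)      # $8,360,000 with the normal 15% attrition replaced

year1 = year_cost(72, 4, 360_000)    # 12 leave, 4 replaced -> $7,680,000
year2 = year_cost(65, 3, 396_000)    # 10 leave, 3 replaced -> $6,986,000

print(f"Baseline: ${baseline:,}")
print(f"Year one: ${year1:,} (saving ${baseline - year1:,})")
print(f"Year two: ${year2:,} (saving ${baseline - year2:,})")
```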

In summary, correct implementation of a digital assistant solution that can handle both HR help desk AND HR admin requests is by far the best approach to achieve maximum ROI.  Done effectively it will also realize superior service levels by providing faster and more accurate turnarounds for your entire workforce in a way that is far more convenient for them. 

Welcome to the world of high ROI, and welcome to the next industrial revolution. It’s ready and available now. 

Contact Us

In early 2011, IBM unveiled on the TV show Jeopardy! what they claimed to be the ultimate FAQ chatbot – Watson. Unfortunately, Watson proved to be “all hat and no cattle” and was never able to translate game show success into practical Enterprise AI success. Meanwhile the world has changed a lot, and AI has made many advances since those early days.

As is often the case with any new technology, the things that appear to be amazing in the early stages of innovation quickly become basic features as the technology matures and real business world problems are tackled and solved.

Today, answering basic questions is considered the bare minimum of what a chatbot needs to be capable of if it is to perform the jobs of actual humans.

We now use the term “Digital Assistant” or “Enterprise Assistant” to describe a chatbot that has many more skills than just being able to answer simple questions. Though often, the first time many organizations try out a chatbot solution, it’s by piloting what they believe is the easy option: an FAQ chatbot. 

However, not all FAQ chatbot skills are created equal. In the AI world of FAQ capabilities there is a huge variance between different vendor solutions. 

Think of it this way. Most people can sing, but most people aren’t great singers. In the same way, most chatbots have basic FAQ skills, but very few chatbots have great FAQ skills.

Freddie Mercury vs. someone
Figure 1: Both of them can sing, but one is a lot better than the other.

So, to cast much needed light upon this subject, we’ve created an FAQ about FAQ chatbots that should help explain the difference. 

Q: Can I add as many questions as I want to an FAQ chatbot, and it’ll be able to answer all of them accurately once I’ve conducted supervised training?

A: For most FAQ chatbots the answer is no! Many of them start to suffer the dreaded “intent mismatch” issue at around 100 questions. Only chatbots that are properly architected can handle thousands of questions accurately.

Q: What’s an “intent mismatch” issue?

A: This is when you ask a chatbot a question and it matches to the wrong question, and therefore gives you the wrong answer. This is the worst thing that can happen in the chatbot world, and it will destroy confidence in the chatbot across your organization.

Q: What causes intent mismatching?

A: Oftentimes it’s poor training that’s the culprit, and that can be easily fixed. But there are scalability issues that tend to kick in around 100 questions (though they can appear at far fewer than that), whereby the chatbot gets more and more confused about what it thinks the human is asking it.

Q: Why is there more likelihood of intent mismatch issues once I get close to 100 questions?

A: As the number of intents for a chatbot increases, the chance of some intents (questions) looking similar to other intents also increases. This is a scalability issue. If the FAQ chatbot is not architected properly it will suffer hugely from scalability issues, and will be unable to handle lots of questions that sound (in the mind of the chatbot) very similar. 

Q: What do “good” FAQ chatbots do that allows them to solve the intent mismatch issue?

A: The good ones have multiple ways of understanding what the human is asking. They don’t just rely on simple NLP (Natural Language Processing) training; they are also able to factor in things like subject recognition, entity existence, and knowledge of your organization’s vocabulary. The reason this is a far superior means of intent matching is that this is how actual humans think. We don’t just use one indicator to understand what someone is saying; we deduce understanding from multiple elements and inferences of a sentence. And that’s how a really smart FAQ chatbot does it too, and how it’s able to handle thousands of questions and match them accurately.
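As a rough illustration of that multi-signal idea, here is a minimal, hypothetical sketch. The signals, weights, and scoring scheme are invented for illustration only; they are not how any particular vendor (IntraSee included) actually implements intent matching.

```python
# Hypothetical sketch: combine several signals instead of relying on one NLP score.
# All weights and signals here are illustrative, not a real vendor implementation.
from dataclasses import dataclass

@dataclass
class IntentCandidate:
    name: str
    nlp_confidence: float     # score from the NLP intent classifier (0-1)
    subject_match: bool       # does the detected subject (e.g. "sick leave") match?
    entities_present: bool    # are the entities this intent expects in the utterance?
    vocabulary_hits: int      # org-specific terms associated with this intent

def score(c: IntentCandidate) -> float:
    """Blend the signals; like a human, use more than one clue to judge intent."""
    s = 0.5 * c.nlp_confidence
    s += 0.2 if c.subject_match else 0.0
    s += 0.2 if c.entities_present else 0.0
    s += min(0.1, 0.05 * c.vocabulary_hits)
    return s

candidates = [
    IntentCandidate("sick_leave_policy", 0.62, True, True, 2),
    IntentCandidate("pto_balance", 0.58, False, False, 0),
]
best = max(candidates, key=score)
print(best.name, round(score(best), 2))
```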

Q: What happens when the question is ambiguous because the human wasn’t completely clear on what they wanted?

A: This all depends on the chatbot. Some chatbots just cross their fingers, make a guess, and hope for the best. Some recognize ambiguity based on confidence level analysis (which isn’t always accurate either). The very best have smart algorithms for dealing with ambiguity and will ask clarifying questions to make sure they understand the “intent” of the question.
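One common, simplified way to detect that ambiguity is a confidence-margin check: if the top two candidate intents score too close together, ask a clarifying question instead of guessing. This continues the hypothetical scoring sketch above and is, again, only an illustration rather than any vendor’s actual algorithm.

```python
# Hypothetical ambiguity check: if the top two intents score too close together,
# ask the user to clarify rather than guessing. Reuses score() and candidates
# from the sketch above.
def respond(candidates, margin=0.1):
    ranked = sorted(candidates, key=score, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    if score(top) - score(runner_up) < margin:
        return f"Did you mean '{top.name}' or '{runner_up.name}'?"   # clarifying question
    return f"Answering intent: {top.name}"

print(respond(candidates))
```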

Q: Does this mean that a good FAQ Chatbot is more complicated to manage than a bad one? Given how much more it is capable of doing?

A: No, quite the opposite. Because it’s massively more capable, it is much easier to manage. Think of it this way: training something that already has lots of skills is much easier than training something that has only very basic skills.

Q: Can FAQ chatbots handle the fact that though the question may be the same, the answer can vary due to location/job/department differences of the person asking the question? For example, the question may be, “what is the sick leave policy”. And depending on who is asking, the answer is often very different.

A: Like the mismatch question, the answer varies based on good chatbots vs bad chatbots. The bad ones only support basic 1-to-1 mappings. One question always equals one answer. In the Enterprise world this doesn’t work at all. So, the good chatbots are capable of understanding demographic information about the person asking the question and can tailor the answer based on that. 

Q: My chatbot vendor said I need to load all my “answers” into their chatbot in the Cloud. Is this a good idea? 

A: No, this is a terrible idea. Loading all your content into someone else’s environment is not only technically unnecessary, it also forces you into the dual maintenance of two sources of truth. A good chatbot needs to be able to plug into your many sources of content to provide the answer.

Q: But what if the answer is too long to show in a conversation? My chatbot vendor is telling me that I need to manually create abbreviated versions of all my unstructured content. 

A: Best practice UX (user experience) is that the chatbot does provide summarized responses (with options to see the full answer) to make the conversation easy to understand by the human. However, good chatbots can use AI to auto-summarize the text, and this would be the recommended approach. 

Q: Can FAQ chatbots only answer a question with static (ex: text, HTML, or web links) information, or can they also include data too? 

A: Basic FAQ chatbots are limited to only being able to respond with static data, but the good ones can also include data from other systems. And the great ones can also bring back that data from both on-premise and multiple Cloud systems. 

Q: It sounds like there’s a massive difference between FAQ chatbots and it’s important to look “under the hood” before I make a decision?

A: Yes, if you can take the time to test-drive a $20,000 car, then you should definitely test-drive any chatbot before making a decision. 

If you’d like to see a great chatbot in action, please contact us for a live demonstration. 

Contact Us
