Why Ethics Is Priority One for Making Voice Assistants Work in the Enterprise

By: Kathy Baxter

Published: November 19, 2019

Voice technology enables machines to understand natural language commands, complete tasks, and generate natural language responses in return. It is the next wave of AI innovation and it’s quickly gaining momentum, which means we’re at a critical juncture for getting it right.

As consumers, we’re all familiar with asking voice assistants built into our smartphones and smart speakers for the weather, directions, and even a joke. Bringing voice technology into an enterprise setting is quite different: it introduces another level of complexity, and privacy and security failures have more severe consequences. The stakes are higher.

This post discusses the ethical implications of voice for business and how to make ethics an operational and strategic priority now, before you’re too far down the path.

Voice is transforming business

Voice technology is moving out of our consumer lives and into business, transforming the way people work and reimagining customer experiences. For example, sales reps can use voice to quickly and conversationally update call notes from a customer meeting directly into Salesforce, without having to set aside time for manual data entry. And a field technician can verbally and instantly check on a service history while en route to her next customer.  

And now, with new voice intelligence capabilities for sales and service, organizations can gain valuable insights from voice data to optimize customer calls. Using automatic speech recognition (ASR) and natural language processing (NLP) to surface insights from conversational data, Einstein Call Coaching helps sales managers coach reps to make the most out of every call, whether it’s handling objections or upselling a new product.

Unique ethical challenges of voice in enterprise

With new use cases come new challenges specific to the enterprise.

Customers must trust that AI-powered voice platforms will do the right thing with their data and not share or misuse it, that customer IP will be kept secure, and that biased or inaccurate predictions will not be perpetuated.

6 ethical considerations to prioritize—starting now

If you are building voice capabilities into your business applications, you have a moral responsibility to understand—and avoid—the possible negative, unintended effects that technology can have.

Here’s what you need to know to build and implement solutions in a responsible, ethical way.

#1 Consent

Customers should have control over how their voice data is being used.

“Two-party consent laws” in 11 states require all parties on a call or in a conversation to give permission to be recorded. Even if this level of consent is not legally required in your state, from an ethical perspective, it’s critical that people know they are going to be recorded and why.

Recommendation: Ensure you or your telephony system are capable of capturing consent and that the feature is enabled.

#2 Accuracy and protection of company IP

In consumer voice assistants, everyone’s data is combined into training data for one global model. That type of model is great at predicting questions most people have and suggesting answers that work for most people. However, in an enterprise system, technical jargon, unique product names, accounts, and other nuances make it difficult for generic ASR systems to get it right. To be useful, voice services in an enterprise environment need to be trained to recognize a company’s unique terminology.

In addition, when sensitive business data from multiple companies is combined, there is a risk that the voice assistant will suggest information that is one company’s IP: account names, mergers, financial data. For this reason, the data must be siloed.

At Salesforce, in our multi-tenant system, our customer data is siloed to protect their IP.

Recommendation: If you have an enterprise platform, keep customer data siloed. Allow customers to create a dictionary of their company’s technical jargon and other terms that are unique to them to increase accuracy.
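
To make the recommendation concrete, here is a minimal sketch of a per-customer dictionary that is kept siloed by tenant. It is an illustration only, not how Salesforce or any particular ASR product implements this: the class name, the tenant IDs, and the product name "QuantaFlow" are all hypothetical, and real systems typically feed custom terms into the recognizer itself rather than correcting the transcript afterward.

```python
import re


class TenantVocabulary:
    """Hypothetical per-tenant dictionary of company-specific terms."""

    def __init__(self):
        # tenant_id -> {spoken form: canonical written form}
        # Each tenant only ever sees its own entries (the "silo").
        self._terms = {}

    def add_term(self, tenant_id, spoken, canonical):
        self._terms.setdefault(tenant_id, {})[spoken] = canonical

    def apply(self, tenant_id, transcript):
        # Look up terms for this tenant only; other tenants' terms are never consulted.
        corrected = transcript
        for spoken, canonical in self._terms.get(tenant_id, {}).items():
            pattern = re.compile(r"\b" + re.escape(spoken) + r"\b", re.IGNORECASE)
            # A function replacement avoids treating the term as a regex template.
            corrected = pattern.sub(lambda match, term=canonical: term, corrected)
        return corrected


vocabulary = TenantVocabulary()
vocabulary.add_term("acme", "quanta flow", "QuantaFlow")

acme_result = vocabulary.apply("acme", "renewal for quanta flow next week")
globex_result = vocabulary.apply("globex", "renewal for quanta flow next week")
```

In this sketch the "acme" transcript comes back with the product name corrected, while the "globex" transcript is left untouched because that tenant never registered the term.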

#3 Customer privacy

During customer calls, sensitive information (like addresses, credit card numbers, social security numbers, and much more) is shared. To ensure that only those with appropriate authorization can access this data, it needs to be encrypted, and there needs to be an automated mechanism to identify and strip out any personally identifiable information (PII).

Recommendation: Follow security best practices by encrypting and protecting customer PII.
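
As a rough illustration of what an automated mechanism for stripping PII from a transcript might look like, here is a small sketch. The two patterns are assumptions chosen for the example (US-style social security numbers and 13 to 19 digit card numbers); production systems need much broader detection, and encryption is left out here because it should come from a vetted library rather than a blog snippet.

```python
import re

# Hypothetical patterns for illustration only; they do not cover all PII.
PII_PATTERNS = {
    # US social security number written as 123-45-6789
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens
    "CARD": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
}


def redact_pii(transcript):
    """Replace each matched span with a bracketed label such as [SSN]."""
    redacted = transcript
    for label, pattern in PII_PATTERNS.items():
        redacted = pattern.sub(f"[{label}]", redacted)
    return redacted


sample = "My card is 4111 1111 1111 1111 and my SSN is 123-45-6789."
print(redact_pii(sample))
# prints: My card is [CARD] and my SSN is [SSN].
```

Redacting before a transcript is stored means the sensitive values never reach downstream systems, which is usually easier to reason about than restricting access after the fact.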

#4 Storage

Some consumer voice assistants allow you to view and delete queries in the last 24 hours or further back. This is one step that helps develop trust. When using voice in an enterprise setting, that same level of control is needed.

Certain regulations, like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), restrict how long PII (including voice recordings and transcripts) can be stored or require it to be deleted at the consumer’s request. Be aware of the legal requirements where your customers live, as requirements vary by country and state.

Recommendation: Give customers control by allowing them to see what has been recorded and delete it as needed. Check to ensure you or your telephony service is capable of providing this and that it is enabled. Store data only as long as business requires or law allows, whichever comes first.
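
The "whichever comes first" rule can be sketched in a few lines. The function names and the day counts below are hypothetical placeholders, not figures taken from GDPR, CCPA, or any other regulation; the point is only that the shorter of the two periods wins.

```python
from datetime import datetime, timedelta, timezone


def retention_deadline(recorded_at, business_need_days, legal_limit_days=None):
    """Return the date after which a recording should be deleted.

    The retention period is the shorter of the business need and the
    legal limit. A legal limit of None means no limit is known to apply.
    """
    if legal_limit_days is None:
        days = business_need_days
    else:
        days = min(business_need_days, legal_limit_days)
    return recorded_at + timedelta(days=days)


def is_expired(recorded_at, now, business_need_days, legal_limit_days=None):
    """True once the retention deadline has been reached."""
    return now >= retention_deadline(recorded_at, business_need_days, legal_limit_days)


# Placeholder values: the business wants 90 days, the assumed legal limit is 30.
recorded = datetime(2019, 11, 19, tzinfo=timezone.utc)
deadline = retention_deadline(recorded, business_need_days=90, legal_limit_days=30)
```

A scheduled job could call `is_expired` for each stored recording and delete the ones that return True, alongside honoring deletion requests from customers as they arrive.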

#5 Access

Which languages a company chooses to support determines who gets access to the service. In addition, voice recognition is still less accurate for women and for people whose accents differ from a US Midwestern accent, which means these individuals have less success using systems that require voice input.

Even if you’re launching only in the US and focusing solely on native English-speaking users, there are eight major American English dialects to account for: Canada, Northern New England, The North, Greater New York, The Midland, The South, North Central, and The West. Of course, there are also many English speakers in the US who are not originally from this country and whom you should be able to support (e.g., English with a French or Japanese accent). It is also important to recognize voice differences by gender (e.g., pitch, frequency). Training our voice technology on a variety of representative voice data sets is an important goal at Salesforce.

Recommendation: Train your voice assistant on a representative data set of many different dialects and genders. And constantly evaluate the AI to measure the accuracy of performance across different dialects and genders.
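
One common way to measure accuracy across groups is word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below computes WER per group. It assumes you have a labeled test set of (group, reference transcript, recognized transcript) triples; the group labels and sentences are made up for the example.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1], len(ref)


def wer_by_group(samples):
    """Aggregate WER per group from (group, reference, hypothesis) triples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        error_count, word_count = word_errors(reference, hypothesis)
        errors[group] += error_count
        words[group] += word_count
    return {group: errors[group] / words[group] for group in words if words[group] > 0}


# Hypothetical evaluation data.
samples = [
    ("midland", "update the call notes", "update the call notes"),
    ("south", "update the call notes", "update a call note"),
]
print(wer_by_group(samples))
# prints: {'midland': 0.0, 'south': 0.5}
```

A large gap between groups, like the one in this toy data, is the signal to collect more representative training data for the group with the higher error rate.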

#6 Productivity

Unlike in consumer voice assistants, telling jokes, making witty remarks, and being entertaining are not a top priority for enterprise environments. The priority is typically on productivity. That usually calls for voice responses that are brief but professional. However, it is important to understand what your customers want and to design the persona of your voice AI accordingly.

Recommendation: Identify the goal your end user wants to achieve with the voice AI and design the persona to match (e.g., formality of the language used, offering suggestions versus strictly doing what the user has asked it to do).

Getting voice right means getting ethics right

Salesforce is committed to data privacy, enterprise-level security for every user, and transparency in how we use voice data. And we’ve prioritized these six ethical considerations in building our voice products.

“We know that technology is not inherently good or bad. It’s what we do with it that matters.” - Marc Benioff

Whether you’re already building out new voice AI solutions, or planning to introduce them to your business in the future, start with the ethical implications and you’ll start on the right foot.