Introduction
In many multiplayer games, communication is a key aspect. Whether it’s for strategizing or making new friends, communication brings players together.
When designing communications, it’s important to consider the wide audience using them. Features for accessible player-to-player communication include Menu Narration, Chat Narration, Speech-to-Text, and Text-to-Speech. These features help players who might not be able to see text, hear audio, or speak, enabling them to take part in voice and text chat. This isn’t just good practice; it’s required by laws like the 21st Century Communications and Video Accessibility Act (CVAA). The CVAA mandates that communications, including in-game chat, be accessible to people with disabilities.
What is CVAA?
Enacted in 2010, Title I of the 21st Century Communications and Video Accessibility Act (CVAA) requires that “advanced” communications be accessible to people with disabilities. This includes voice chatting, texting, e-mail, instant messaging, and video communications. Title II relates to televised content. This article focuses on how Title I requirements apply to games.
The CVAA makes sure that accessibility laws enacted in the 1980s and 1990s are brought up to date with 21st century technologies, including new digital, broadband, and mobile innovations.
– U.S. FCC, CVAA Consumer Guide
Through a collaboration between the Federal Communications Commission (FCC) and the Entertainment Software Association (ESA), the games industry received a series of waivers postponing CVAA compliance. The last waiver expired on December 31, 2018. Games developed after this date, or those with large post-launch updates, must comply.
CVAA Compliance
Games featuring player-to-player communications (text, voice, or even rare video chatting) must comply with CVAA. If your game does not feature these communications, CVAA does not apply. Compliance with the CVAA is summarized in three parts:
- Criteria: ‘Performance Objectives’ for your game to achieve. The performance objectives require that software is operable without abilities such as vision, color perception, hearing, dexterity, and speech.
- Recordkeeping: Proof that accessibility was considered early in development, including feedback from players with disabilities.
- Reporting: Companies must register with the FCC. Every year organizations provide a representative’s contact info and confirm active recordkeeping. If a complaint is filed, these records are requested.
The law defines goals, not implementation details. Developers must determine which features meet those goals. In this article, we’ll explore four features that make communications more accessible: Menu Narration, Chat Narration, Speech-to-Text, and Text-to-Speech.
Note: Having these features does not, by itself, provide full compliance with CVAA criteria. Not only must communications be accessible, but so must every screen required to set up and navigate to them. Accessible communication also involves many other aspects, such as input, text size, and contrast.
Text Chat
Text chat is the most common way for players to communicate with each other in games. Before networks could handle voice chatting, players sent each other typed messages. Text chat has been popular in Massively Multiplayer Online (MMO) games since the mid-90s. For most players, text chat is a visual feature: they read messages from friends. This is where Narration comes in, allowing players to hear text without relying on sight.
Menu Narration
In this article, ‘Narration’ refers to screen-reading text aloud, while ‘Text-to-speech’ refers to a player’s typed message being spoken aloud in voice chat (which we’ll cover in the next section). Narration is often called Text-to-speech as it uses the same technology, converting written text into a spoken voice. In games, this feature is also called: Narrator, UI Narration, Screen Reader (the primary web usage) or variations of Read to Me.
Menu Narration reads aloud the game’s interactable elements so that players don’t need to see them. Players use this feature by navigating the interface with a keyboard or controller, because the precise positioning required for cursor selection is a barrier without vision. Since the route to setting up communications must be accessible under CVAA, screens like initial setup, title menus, lobbies, and settings menus should include narration.
Chat Narration
Chat Narration is, as the name suggests, narration only for chat messages. The game narrates incoming text chat in real time, reading messages out loud to the player. This gives the player a way to receive text chat without relying on vision. A game might let you turn on narration for the entire interface and chat narration separately. Offering this choice is ideal for players who want only messages read aloud, or only menus.
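As a minimal sketch of offering menu and chat narration as separate choices, the snippet below keeps two independent toggles and checks the matching one before anything is read aloud. The names `NarrationSettings` and `should_narrate` are hypothetical and not taken from any engine or platform API.

```python
from dataclasses import dataclass


@dataclass
class NarrationSettings:
    # Hypothetical option names; each toggle is independent of the other.
    menu_narration: bool = False
    chat_narration: bool = False


def should_narrate(source: str, settings: NarrationSettings) -> bool:
    """Return True if text from the given source ("menu" or "chat") should be read aloud."""
    if source == "menu":
        return settings.menu_narration
    if source == "chat":
        return settings.chat_narration
    return False  # unknown sources stay silent


# A player who only wants incoming messages read aloud.
chat_only = NarrationSettings(menu_narration=False, chat_narration=True)
print(should_narrate("chat", chat_only))  # True
print(should_narrate("menu", chat_only))  # False
```

Keeping the toggles separate, rather than using one master switch, is what lets a sighted player with a reading disability hear chat without also hearing every menu item.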

Accessible Games Initiative
Menu Narration aligns with the Accessible Games Initiative (AGI) tag Narrated Menus.
Chat Narration is featured in the AGI tag Chat Speech-to-Text & Text-to-Speech. As noted earlier, narration is often referred to as ‘text-to-speech’, describing the underlying technology. AGI defines its usage of Text-to-speech as the narration of chat:
If the game lets players communicate with each other using text, support text-to-speech so that players can hear the conversation narrated in real-time.
Voice Chat
Voice chatting became prevalent in the late-90s with online games for the Sega Dreamcast. Voice chat uses Voice over Internet Protocol (VoIP) to send voices from one IP address to another. For many, voice chatting is the fastest way for players to communicate with each other during high-action gameplay. But what if your players cannot use their microphone? Or can’t listen to game audio? This could exclude them from the conversation. This is where Speech-to-Text and Text-to-Speech come in, bridging communication gaps.
Speech-to-Text
Speech-to-Text (STT) allows players to participate in voice chat without relying on hearing. It converts spoken words into written text. This feature is valuable for players who are deaf or hard-of-hearing, as well as those who prefer reading over listening. If players are talking over microphones, STT provides a way for those playing without audio to know what’s said.

Bonus: Voice Dictation for Outgoing Messages is another form of speech-to-text. It allows players to input text messages using their voice. Dictation is beneficial for players who struggle with typing or controller input.
Text-to-Speech
Text-to-Speech (TTS) lets players participate in voice chat without relying on speaking. Players may be unable to use their microphone for a variety of reasons, such as a hardware malfunction or consideration for other people in the room. With TTS, the player types a message, and the game converts the text into spoken words. The message is then broadcast to other players in the voice chat. In effect, the player has assistive technology that speaks for them.
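The flow described above (type a message, convert it to speech, broadcast it to the voice chat) can be sketched roughly as follows. Everything here is an assumption for illustration: `VoiceChannel`, `send_typed_message`, and `fake_synthesize` are hypothetical names, and a real game would pass in its platform or backend speech synthesizer instead of the stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VoiceChannel:
    """Stand-in for a voice chat session (hypothetical; it only records what was sent)."""
    members: List[str]
    sent: List[Tuple[str, bytes]] = field(default_factory=list)

    def broadcast(self, sender: str, audio: bytes) -> None:
        # Deliver the audio to everyone in the channel except the sender.
        for member in self.members:
            if member != sender:
                self.sent.append((member, audio))


def send_typed_message(sender: str, text: str,
                       synthesize: Callable[[str], bytes],
                       channel: VoiceChannel) -> bool:
    """Convert a typed message to audio and broadcast it; skip empty input."""
    message = text.strip()
    if not message:
        return False
    audio = synthesize(message)
    channel.broadcast(sender, audio)
    return True


def fake_synthesize(text: str) -> bytes:
    # Placeholder for a real speech synthesizer: returns the text as bytes.
    return text.encode("utf-8")


channel = VoiceChannel(members=["Player 1", "Player 2", "Player 3"])
send_typed_message("Player 1", "On my way", fake_synthesize, channel)
print([member for member, _ in channel.sent])  # ['Player 2', 'Player 3']
```

Passing the synthesizer in as a function keeps the chat logic independent of whichever speech service the game ends up using.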

Accessible Games Initiative
Both voice chatting features are in the Accessible Games Initiative tag Chat Speech-to-Text & Text-to-Speech. In the tag’s Developer-Facing Requirements, Text-to-speech is “Outgoing text-to-speech.” This distinguishes it from Chat Narration.
If the game lets players communicate with each other using voice, support these options:
- Speech-to-text so that players can read a text transcript in real-time.
- Outgoing text-to-speech so that players can send a message to the voice chat.
Communication Considerations
- Speaker Identification: In group chats, many people might be chatting. Narration and Text-to-Speech should identify who said what. If your game features text and voice chat, distinguish how the message was sent.
- Channel Identification: If a game has separate chat channels (whisper, team, etc.) communication features should include that information. For example, “Global Chat, Player 2, Good Game” informs the player of the intended audience.
- Voice Profiles: A range of voices allows players to customize narration and text-to-speech. Players should be able to choose voices with various tones (such as masculine, feminine, accented, or robotic) to suit their preferences and identity. For narration, this changes the voice that reads the screen. For text-to-speech, it changes how the player sounds to others. Customization also helps identify voices when more than one player uses text-to-speech.
- Narration Customization: There are other ways to enhance narration. Advanced users may want speed options up to 400%. Another helpful adjustment is verbosity, or the amount of detail included in narration.
  - New players may prefer slower voices that sound more natural and human-like. Advanced players may prefer synthetic voices, which are easier to “speed read” with because the spoken consonants and vowels remain pronounced at faster settings.
- Platform Handling: For CVAA, you don’t have to implement communication features from scratch. Instead, you can leverage platform screen readers or a backend API such as Azure PlayFab.
  - If you leverage platform screen readers, tell players which ones are supported. It’s also helpful to include the setting in-game, for increased player awareness.
- High-Volume Chats: In a busy lobby or large multiplayer game, chat could scroll quickly. The chat’s narrator might finish one message and immediately start the next. Narration should not overlap messages. Include options to pause or stop chat narration. This can be accomplished by minimizing chat or changing focus.
  - Consider including digital navigation within chats. This allows players to review the chat log at their own pace by highlighting a message at a time.
- Quick Chat: Quick chatting, or sending pre-selected phrases to text chat, is a form of messaging. These messages must be narrated for CVAA.
- Emojis and Emoticons: If emojis and emoticons can be sent to text chat, they should be included in narration.
- Emotes: For CVAA, emote communications (picture, animation, or gif-based) are exempt from compliance as they do not fall under voice, text, or video chatting. Consider providing Alt Text or descriptive labels for emotes, so players using narration have context.
- Accuracy and Errors: No speech recognition system is perfect. Background noise, accents, fast speech, or uncommon names can lead to transcription errors.
- Delay: Converting text into speech and vice versa isn’t immediate. Most games achieve near-real-time, but developers should minimize delay as much as possible.
- Language: If your game supports multiple languages, provide a list of supported languages for your communication features.
- Profanity: For games with a mature audience, profanity filtering should be optional. Communication features should follow the same setting: if filtering is off, narration and transcripts should not censor either. Also consider how profanities are censored, since a string of asterisks might not narrate well.
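Several of the considerations above can be combined in a small sketch: building a narration line that names the channel and speaker, spelling out emoji, replacing asterisk-censored words with a speakable word, and queueing lines so narration never overlaps. All names here (`ChatMessage`, `narration_line`, `NarrationQueue`, the `EMOJI_NAMES` table, and the choice of the word "censored") are assumptions for illustration, not part of any platform API.

```python
import re
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional

# Illustrative table only; a real game would cover every emoji it allows.
EMOJI_NAMES = {"👍": "thumbs up", "😂": "face with tears of joy"}


@dataclass
class ChatMessage:
    channel: str  # e.g. "Global Chat", "Team", "Whisper"
    speaker: str
    text: str


def narration_line(msg: ChatMessage) -> str:
    """Build the string handed to the speech engine: channel, speaker, then message."""
    # Replace any word containing a run of asterisks with a speakable word.
    text = re.sub(r"\S*\*{2,}\S*", "censored", msg.text)
    for symbol, name in EMOJI_NAMES.items():
        text = text.replace(symbol, f" {name} ")
    text = " ".join(text.split())  # collapse the extra spaces added above
    return f"{msg.channel}, {msg.speaker}, {text}"


class NarrationQueue:
    """Hands out one line at a time so messages are never spoken over each other."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self.speaking = False
        self.paused = False

    def enqueue(self, line: str) -> None:
        self._pending.append(line)

    def next_to_speak(self) -> Optional[str]:
        # Nothing starts while a line is still playing or narration is paused.
        if self.speaking or self.paused or not self._pending:
            return None
        self.speaking = True
        return self._pending.popleft()

    def finished(self) -> None:
        # Call when the speech engine reports the current line is done.
        self.speaking = False

    def stop(self) -> None:
        # Drop everything still waiting, e.g. when the player minimizes chat.
        self._pending.clear()
        self.speaking = False


print(narration_line(ChatMessage("Global Chat", "Player 2", "Good Game")))
# Global Chat, Player 2, Good Game
```

In a busy lobby the game would call `enqueue` for each incoming message and `next_to_speak` whenever the speech engine is idle; setting `paused` or calling `stop` gives the player the pause and stop controls described above.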
Conclusion
The rise of text and voice chatting has shown the importance of socializing online. Recent advances in technology have provided ways we can bridge communication gaps.
Menu Narration, Chat Narration, Speech-to-Text, and Text-to-Speech ensure more people can participate in the conversations happening in your game.
If you need a partner in assessing your game’s communications, Accessibility Labs is here for you. Please reach out to us; we’re a message away.
Additional Resources
- IGDA Game Accessibility SIG: Demystifying CVAA
- FCC: 21st Century Communications and Video Accessibility Act (CVAA) Consumer Guide
- FCC: Accessibility of Communications in Video Games
- Xbox Accessibility Guidelines: Speech-to-text/Text-to-speech Chat
- Xbox Accessibility Guidelines: Communication Experiences
- Microsoft Learn: What is PlayFab?
- Microsoft Learn: PlayFab Party text-to-speech and text input UX guidelines
- Microsoft Learn: PlayFab Party speech-to-text and text display UX guidelines
- Game Accessibility Guidelines: Realtime Text Speech Transcription
