Recall Bias № 124 · Last updated 6 June 2026

Modality Effect.

"They heard it once and forgot; they saw it twice and still forgot — channel choice is memory choice."

01 Overview

The modality effect is the finding that recall differs depending on whether information was presented auditorily or visually, and on how the two channels are mixed. Visual material competes with other visual material for the same memory resources, so spoken information can sometimes be recalled better when visual memory is overloaded. For multimodal products, the channel is not redundant decoration; it is an encoding strategy.

Designers ship video tutorials, voice assistants, readable docs, and silent UI in parallel. The modality effect raises two questions: which channel carries the critical memory, and whether combining channels helps or creates interference. Accessibility requirements multiply modalities; without intentional design, users remember the wrong channel's gist or nothing at all.

02 Detailed explanation

Modality shows up in product memory trade-offs:

  • Voice assistant instructions that users cannot recall once they need the silent UI: auditory memory does not transfer on its own.
  • Video onboarding with narration plus on-screen text: users remember neither fully because attention is split.
  • Alert tones without a visual backup: users remember that something happened, not what to do.
  • Screen reader users encode speech output, so a visual redesign can break memory that is tied to auditory structure.

Effective multimodal design assigns complementary roles — visual for structure, audio for urgency, text for precision — rather than duplicating content in competing channels. Redundancy is not always reinforcement; it can be interference.

03 Why it exists

Auditory and visual streams draw on separate processing resources, so one channel can take over when the other is saturated, until both compete for the same semantic goal.

Products default to "add a video" without testing whether the target memory is procedural (visual-motor), verbal (auditory), or symbolic (text).

The short version

Choose the channel that matches what users must remember and the context they'll need it in.

04 Effects on users

Users remember podcast-style help heard during a commute but forget in-app tooltips seen while distracted: recall depends on modality plus context.

Deaf and hard-of-hearing users cannot rely on the auditory modality; visual and text encoding must carry the full load, not serve as a supplement.

05 Effects on designers & teams

Teams duplicate content instead of designing for modality:

  • Video-only critical instructions. With no persistent text alternative, the memory is trapped in a video viewed once.
  • Sound effects as sole confirmation. Users remember the beep, not the outcome.
  • Competing narration and captions. Split attention weakens both encodings.
  • Ignoring screen reader speech patterns. A visual redesign breaks auditory landmarks.

06 Practical takeaways

  • Assign one primary channel per critical fact. Secondary channels support it; they do not compete with it.
  • Provide persistent text for must-remember steps. Audio and video decay fast.
  • Test recall in the target context, for example a silent office versus walking with a phone.
  • Design accessible parity across modalities. Deliver the full message in each channel users need.
  • Use audio for urgency, text for precision. Give the channels complementary roles.
  • Avoid redundant simultaneous streams. Sequential reinforcement beats parallel overload.
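
Some of these takeaways can be checked mechanically during content review. The sketch below is one possible way to do that, not an established tool: the `CriticalFact` schema, its field names, and the `audit` function are all hypothetical, and the three channel names are an assumption for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed channel vocabulary; adjust to the product's own modalities.
CHANNELS = ("text", "audio", "visual")


@dataclass(frozen=True)
class CriticalFact:
    """One thing users must remember, and how it is delivered (hypothetical schema)."""
    name: str
    primary: str                          # the single channel that carries the fact
    secondary: tuple[str, ...] = ()       # supporting channels
    persistent_text: bool = False         # can users re-read it after the moment passes?
    simultaneous_duplicate: bool = False  # same words narrated and shown at once?


def audit(fact: CriticalFact) -> list[str]:
    """Return plain-language warnings; an empty list means no rule was tripped."""
    warnings = []
    if fact.primary not in CHANNELS:
        warnings.append(f"{fact.name}: unknown primary channel {fact.primary!r}")
    if fact.primary in fact.secondary:
        warnings.append(f"{fact.name}: primary channel is also listed as secondary")
    if not fact.persistent_text:
        warnings.append(f"{fact.name}: no persistent text for a must-remember fact")
    if fact.simultaneous_duplicate:
        warnings.append(f"{fact.name}: duplicate streams run in parallel; consider sequencing them")
    return warnings


# Example: a narrated video that also shows the same words on screen.
video_tour = CriticalFact(name="reset steps", primary="visual", simultaneous_duplicate=True)
for warning in audit(video_tour):
    print(warning)
```

A check like this only flags structural gaps. It cannot replace testing recall in the target context.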

07 Design examples

Voice UI

Asked, not remembered

Users get recipe steps from a smart speaker. At the stove, they cannot recall step three: auditory memory has no visual anchor. The feature that sends the steps as text goes underused.

Onboarding

Watch and forget

A narrated video tour scores well on completion, but a task test a week later fails: attention was split across two modalities, and neither channel was encoded deeply.

Alerts

Beep without meaning

The error chime is distinct, but users remember "something is wrong", not which field: auditory salience without semantic pairing.
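
One way to enforce that pairing is to make a sound-only error impossible to construct. The following is a minimal sketch under that assumption; `ErrorAlert`, `build_error_alert`, `render_text`, and the `"error-chime"` identifier are hypothetical names, not part of any real alerting API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorAlert:
    field_label: str      # which field is wrong
    message: str          # what to do about it, in words
    sound: Optional[str]  # optional attention cue; never the only content


def build_error_alert(field_label: str, message: str,
                      sound: Optional[str] = "error-chime") -> ErrorAlert:
    """Refuse to create an alert that has a sound but no readable meaning."""
    if not field_label.strip() or not message.strip():
        raise ValueError("an error alert needs a field and a readable message, not only a sound")
    return ErrorAlert(field_label=field_label.strip(), message=message.strip(), sound=sound)


def render_text(alert: ErrorAlert) -> str:
    """Persistent text shown next to the field, available after the chime fades."""
    return f"{alert.field_label}: {alert.message}"


alert = build_error_alert("Email", "Enter a valid email address")
print(render_text(alert))
```

The chime stays as the urgency cue, while the text carries the precise content users need to recall.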

Accessibility

Landmarks in speech

Screen reader users navigate by spoken heading structure. When a visual rebrand changes the layout but not the headings, memory survives; a rebrand without heading stability breaks recall.

08 Ethical risks

Critical safety and legal information delivered in a single ephemeral modality (such as a brief audio-only announcement) excludes and endangers users who cannot encode or replay it.

Modality choices that favour one able group encode inequality — memory architecture is accessibility architecture.

Self-test: What must users remember after the modal closes — and is it encoded in a modality they can access and replay?

10 Suggested reading