Rehabilitation Dataset Directory: Repository Profile

Repository: AphasiaBank

Basic Information
Repository Full Name: AphasiaBank
Repository Acronym
General Theme or Special Population

Aphasia, Communication impairment, Neurological conditions, Stroke

AphasiaBank is a shared database of multimedia interactions for the study of communication in aphasia. It is funded by the National Institutes of Health (NIH) through the National Institute on Deafness and Other Communication Disorders (NIDCD). It provides researchers access to a large, shared database that can facilitate hypothesis testing and increase methodological replicability, precision, and transparency.

It is a database of interviews between aphasic participants and clinicians collected from multiple sites using a consistent protocol format. The video recordings are transcribed in the CHAT format (MacWhinney, 2000) and each utterance is linked to the corresponding segment in the video recordings. The linked transcripts are made available to AphasiaBank members for further analysis and playback over the web. 
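
As an illustration of how an utterance can be linked to its video segment, the following is a minimal Python sketch. It assumes a CHAT main-tier line of the form `*PAR:<tab>text` followed by a media time marker written as `start_end` in milliseconds between two 0x15 control characters. The sample line, the `Utterance` type, and the `parse_main_tier` helper are hypothetical and are not part of any AphasiaBank or TalkBank tool; the CHAT manual remains the authoritative description of the format.

```python
import re
from typing import NamedTuple, Optional

# Assumption: a media time marker is "start_end" in milliseconds,
# wrapped in two 0x15 control characters at the end of the utterance.
BULLET = re.compile(r"\x15(\d+)_(\d+)\x15")


class Utterance(NamedTuple):
    speaker: str
    text: str
    start_ms: Optional[int]
    end_ms: Optional[int]


def parse_main_tier(line: str) -> Optional[Utterance]:
    """Parse one speaker line; return None for headers (@) and dependent tiers (%)."""
    if not line.startswith("*"):
        return None
    speaker, sep, rest = line[1:].partition(":")
    if not sep:
        return None
    match = BULLET.search(rest)
    start = int(match.group(1)) if match else None
    end = int(match.group(2)) if match else None
    # Remove the time marker so only the transcribed words remain.
    text = BULLET.sub("", rest).strip()
    return Utterance(speaker, text, start, end)


# Hypothetical sample line, not taken from any AphasiaBank transcript.
sample = "*PAR:\tI went to the store . \x150_2350\x15"
utt = parse_main_tier(sample)
```

With start and end offsets recovered this way, a playback tool could seek to the matching segment of the recording for any utterance.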
The repository also provides access to related resources including computer programs, manuals, transcription training, research protocols, IRB guidelines, and ground rules.

Access to transcripts and video materials is password-restricted to AphasiaBank members, but membership is automatically granted on request to researchers studying aphasia. Access to the related programs, manuals, and other resources is open and free to everyone.
Key Terms: Aphasia, Stroke, Communication, Language, Speech disorder, Multimedia, Video, Transcripts
Sponsoring Agency/Entity

National Institutes of Health (NIH);

National Institute on Deafness and Other Communication Disorders (NIDCD)

Datasets Information
Data Type(s)

Clinical, Interviews


North America


United States

Strengths and Limitations
  • All data are available to registered users
  • Data follow a tightly specified collection protocol
  • Complete documentation is freely available
  • Wide variety of data available, including video, transcripts, and part-of-speech tags
  • Basic but functional interface
Data Repository Details
Primary Website


Repository Tools
  • Browsable database
  • Protocols and IRB guidelines
  • Manuals
  • Links to linguistic analytic programs  
  • Links to related posters, presentations and publications
  • Tools for morphological tagging, available in several languages including English, Spanish, German, French, Italian, Japanese, Cantonese, and Mandarin
Data Submission Requirements


Data Submission Process

Researchers interested in submitting data must join the consortium and are expected to contribute corpora constructed with TalkBank programs and tools. TalkBank and its users are obligated to ensure that these contributions are properly acknowledged and cited and that the data are correctly stored and distributed.

Data Access


Data Access Requirements

Access to the data in AphasiaBank is password-protected and restricted to members of the AphasiaBank consortium group. Membership is automatically granted on request to all researchers studying aphasia.

Documentation & Resources

See main page for access to protocols, trainings, manuals, teaching videos and other materials:


Other Papers

AphasiaBank: Methods for Studying Discourse:


AphasiaBank as BigData


Related posters:


Related publications:


The Rehabilitation Research Cross-dataset Variable Catalog has been developed through the Center for Large Data Research & Data Sharing in Rehabilitation (CLDR). The CLDR involves a consortium of investigators from the University of Texas Medical Branch, Cornell University's Yang Tan Institute (YTI), and the University of Michigan. The CLDR is funded by the NIH National Institute of Child Health and Human Development, through the National Center for Medical Rehabilitation Research, the National Institute of Neurological Disorders and Stroke, and the National Institute of Biomedical Imaging and Bioengineering (P2CHD065702).

Other CLDR supported resources and collaborative opportunities:

Acknowledgements: This tool was developed through the efforts of William Erickson and Arun Karpur, and web designers Jason Criss and Jeff Trondsen at Cornell University. Many thanks to graduate students Kyoung Jo Oh and Yeong Joon Yoon who developed much of the content used in this tool.

For questions or comments, please contact disabilitystatistics@cornell.edu