Toward text-to-picture synthesis
A. B. Goldberg, J. Rosin, X. Zhu and C. R. Dyer,
Proc. NIPS 2009 Symposium on Assistive Machine Learning for People with Disabilities, 2009.
It is estimated that more than 2 million people in the United States have significant communication impairments that lead them to rely on methods other than natural speech alone for communication. One commonly used type of augmentative and alternative communication system is pictorial communication software such as SymWriter, which uses a lookup table to transliterate each word (or common phrase) in a sentence into an icon. This is an example of converting information between modalities. However, the resulting sequence of icons can be difficult to understand. We have been developing general-purpose Text-to-Picture (TTP) synthesis algorithms that use machine learning techniques to improve understandability. Our goal is to help users with special needs, such as the elderly or those with disabilities, to rapidly browse documents through pictorial summaries. Our TTP system targets general English. This differs from other pictorial conversion systems that require hand-crafted narrative descriptions of a scene, 3D models, or special domains. Instead, we use a concatenative or collage approach and present how machine learning enables the key components of our TTP system.
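
The lookup-table transliteration that the abstract attributes to pictorial communication software can be illustrated with a minimal sketch. This is not SymWriter's implementation or the authors' TTP system. The table entries, icon file names, and the function name `transliterate` are hypothetical, and the greedy longest-phrase matching is an assumption about how "common phrases" might be handled.

```python
# Hypothetical icon table: keys are word tuples, values are invented icon file names.
ICON_TABLE = {
    ("ice", "cream"): "ice_cream.png",
    ("the",): "the.png",
    ("girl",): "girl.png",
    ("eats",): "eat.png",
}

# Longest phrase (in words) that the table can match.
MAX_PHRASE_LEN = max(len(key) for key in ICON_TABLE)


def transliterate(sentence):
    """Map a sentence to a list of icons, one per matched word or phrase.

    Words with no table entry yield None. Punctuation is not handled;
    the sentence is simply lowercased and split on whitespace.
    """
    words = sentence.lower().split()
    icons = []
    i = 0
    while i < len(words):
        # Assumed strategy: try the longest phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in ICON_TABLE:
                icons.append(ICON_TABLE[key])
                i += n
                break
        else:
            icons.append(None)  # no icon available for this word
            i += 1
    return icons


print(transliterate("The girl eats ice cream"))
```

Because each word or phrase is mapped independently, the output is a flat icon sequence that carries no information about how the words relate to one another, which is consistent with the abstract's remark that such sequences can be difficult to understand.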