This talk explores how vocal synthesis controlled in real time by hand gestures (chironomy) can help non-native speakers practice the intonation of a foreign language. Such practice addresses three sources of difficulty in learning intonation. First, it can train the ear to perceive unfamiliar features of speech by presenting them through visual and kinesthetic modalities. Second, controlling pronunciation with hand gestures bypasses ingrained patterns in the natural voice that are difficult to correct. Finally, vocal synthesis lets a learner focus on the suprasegmental level without being preoccupied with fine phonetic detail at the segmental level. I present findings from two experiments, a pilot with non-native speakers of French and a study with francophone learners of English, and discuss lessons learned and directions for future work.