Working Group Report: Open Linguistic Data and Localisation

This report is about proposal writing: linking Open Linguistic Data (OLD) to ANLoc.

Question: If we had linguistic data (LD), how would that help us?

Link to ANLoc vision?
• Remove limitations for disabled people
o Speech might be language-specific; colours and font size are not

Link to ANLoc (what we can do, not necessarily linked to the vision)
• The spellchecker project needed OLD
o Results limited by availability of LD
• Other initiatives are already doing this and investing heavily (because they have good reasons to do so); cross-reference them:
o TAUS Data Association
o ELRA/ELDA
o Linguistic Data Consortium (LDC)
• It’s necessary for
o Fonts
o Spellcheckers
o Thesauri

Link to Localisation
• Needed for
o Translation Memory (TM)
o MT
o Terminology

Which problems to solve
o Predictive text input (mobile phones)
o Translate English laws into local languages
o A person calls in and responds to voice prompts (though this does not scale)
o See “Freedom Phone”

Questions
• What is a reasonable amount of data to collect per language?
• The data needs to be open (to stimulate and enable activity)
• Needs investment
• Needs prioritization (which languages, domains; amount; tagging)
• Don’t forget speech; why not use radio programmes?

Observation
• People in Africa do not call up voicemail
• NEED a SUCCESS STORY!