Since we published CIOL AI Voices in 2023, CIOL Council has met again and underlined the importance of enabling linguists to keep abreast of developments in AI through a regularly refreshed resource hub. Here are some useful recent resources for linguists, which we will add to and update periodically, alongside future CIOL Voices pieces.
Significant Recent News & Articles for Linguists
This article discusses how the quality of data used to train large language models (LLMs) affects their performance in different languages, especially those with less original high-quality content on the web. It reports on a recent paper by researchers from Amazon and UC Santa Barbara, who found that a lot of the data for less well-resourced languages was machine-translated by older AIs, resulting in lower-quality output and more errors. The article also explores the implications of this finding for the future of generative AI and the challenges of ensuring data quality and diversity.
This article explores the transformative potential of generative AI in the translation industry. It illustrates how translators may be able to enhance their work quality and efficiency using generative AI tools (notably the OpenAI Translator plugin with Trados Studio) and the importance of 'prompt' design in achieving desired outputs. The article emphasises, however, that generative AI will augment rather than replace human translators by automating routine tasks, and encourages translators to adapt to and adopt AI as these tools herald new opportunities in the field.
The article discusses a study on the use of large language models (LLMs), specifically GPT-4 and GPT-3.5-turbo, for post-editing machine translations. The study assessed the quality of post-edited translations across various language pairs and found that GPT-4 effectively improves translation quality and can apply knowledge-based or culture-specific customizations. However, it also noted that GPT-4 can produce hallucinated edits, necessitating caution and verification. The study suggests that LLM-based post-editing could enhance machine-generated translations' reliability and interpretability, but also poses challenges in fidelity and accuracy.
This article from The Conversation analyzes the three main models of AI regulation that are emerging in the world: Europe’s comprehensive and human-centric approach, China’s tightly targeted and pragmatic approach, and America’s dramatic and innovation-driven approach. It also examines the potential benefits and drawbacks of each model, and the implications for global cooperation and competition on AI.
Amazon researchers have discovered that a significant portion of web content is machine translated, often poorly and with bias. In their study, they created a large corpus of sentences in 90 languages to analyze their characteristics. They found that multi-way translations are generally of lower quality and differ from 2-way translations, suggesting a higher prevalence of machine translation. The researchers warn about the potential pitfalls of using low-quality, machine-translated web-scraped content for training Large Language Models (LLMs).
Microsoft New Future of Work Report 2023
This new report from Microsoft is about the impact of large language models (LLMs) on the future of work, especially in the domains of information work, critical thinking, human-AI collaboration, complex and creative tasks, team collaboration and communication, knowledge management, and social implications. It synthesizes recent research from Microsoft and other sources to provide insights and recommendations for how to leverage LLMs to create a new and better future of work with AI.
The report covers the productivity and quality effects of LLMs, the challenges and opportunities of 'prompting' and interacting with LLMs, the potential of LLMs to augment and provoke critical thinking, the design principles and frameworks for effective human-AI collaboration, the domain-specific applications and implications of LLMs in software engineering, medicine, social science, and education, the ways LLMs can support team collaboration and communication, the impact of LLMs on knowledge management and organizational changes, and the ethical and societal issues raised by LLMs. The report also provides examples of how LLMs are being used and developed at Microsoft and elsewhere.
It also flags (on p36) the important concept of an increased risk of “moral crumple zones”. It points out that studies of past 'automations' teach us that when new technologies are poorly integrated within work/organisational arrangements, workers can unfairly take the blame when a crisis or disaster unfolds. This can occur when automated systems only hand over to humans at the worst possible moments, when it is very difficult to spot or correct the problem before it is too late.
This could be compounded by 'monitoring and takeover challenges' (set out on p35), where jobs might increasingly require individuals to oversee what intelligent systems are doing and intervene when needed. However, studies reveal potential difficulties: monitoring requires vigilance, but people struggle to maintain attention on monitoring tasks for more than half an hour, even if they are highly motivated.
Linguists are likely to face these challenges, alongside the many new possibilities and opportunities that this report calls out.
UK Government's Generative AI Framework: A guide for using generative AI in government
This document provides a practical framework for civil servants who want to use generative AI. It covers the potential benefits, limitations and risks of generative AI, as well as the technical, ethical and legal considerations involved in building and deploying generative AI solutions in a government context.
The framework is divided into three main sections: Understanding generative AI, which explains what generative AI is, how it works and what it can and cannot do; Building generative AI solutions, which outlines the practical steps and best practices for developing and implementing generative AI projects; and Using generative AI safely and responsibly, which covers the key issues of security, data protection, privacy, ethics, regulation and governance that need to be addressed when using generative AI.
It sets out ten principles that should guide the safe, responsible and effective use of generative AI in government and public sector organisations:
- You should know what generative AI is and what its limitations are
- You should use generative AI lawfully, ethically and responsibly
- You should know how to keep generative AI tools secure
- You should have meaningful human control at the right stage
- You should understand how to manage the full generative AI lifecycle
- You should use the right tool for the job
- You should be open and collaborative
- You should work with commercial colleagues from the start
- You should have the skills and expertise needed to build and use generative AI
- You should use these principles alongside your organisation’s policies and have the right assurance in place
The framework also provides links to relevant resources, tools and support, as well as a set of posters with the key messages boldly set out.
A transparent framework is clearly to be welcomed. From the perspective of linguists working with UK government, these principles also give a useful framework for accountability and the means to ask reasonable questions about policies and practice that may affect their work.
Read the 'CIOL AI Voices' White Paper
Click the image or download the PDF here.