Episodes · S2 E20

Breaking the Language Barrier: Smartling's AI Translation Pipeline | Olga Beregovaya

Apr 23, 2025 · Olga Beregovaya, Smartling · 41 min

RAG & Retrieval · AI Evaluation & Reliability · Multimodal AI

Key takeaways

Olga Beregovaya benchmarks human output at “a translator can deliver 2,000 words a day — that’s the metric,” and says a well-trained, well-prompted, fine-tuned model lifts that by “magnitude and order of magnitude.” She frames the payoff as a package: lower cost, higher productivity, and more predictable quality.
The bottleneck in language AI isn’t big-model capability — it’s the English-centric core. Beregovaya notes most foundational models remain “English language and English phenomena centric,” so a carefully crafted English prompt often “will render a better result in the target language” than a linguist’s prompt in their own native language.
Stuffing every linguistic asset into a prompt is, in Beregovaya’s words, a “waste of fortune.” Rather than feed translation memories, glossaries, and 100-page style guides into a 30,000-token prompt, Smartling uses RAG to “only fetch what’s relevant” — “a huge, huge, huge” game-changer and “a great way of mitigating biases and hallucinations.”
Smaller purpose-built models are beating generalists at translation now, but Beregovaya hedges it as a “pause and see what’s next” moment: she cites research that the more tasks a model handles, the more its performance on each individual one degrades — “jack of all trades, master of none” — yet she expects larger models to eventually catch up.
Data curation now matters more than data scale, Beregovaya argues: cleaner, smaller datasets beat “access to larger but potentially noisy” ones. She flags a hard floor — “you cannot fine-tune a model unless you have, like, your 100,000 strings” — and says the real struggle is harmonizing inconsistent bitext, glossary, and style-guide data.
Beregovaya’s sharpest agentic example is self-healing: a quality-estimation pass flags errors, then a smart agent says “heal everything that I’ve just identified.” Agents also route content and assign translators — but she warns teams to first ask: “does it call for an agent, or are we doing just fine with rule-based approaches?”
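
The RAG takeaway above (fetch only the relevant linguistic assets instead of stuffing everything into the prompt) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Smartling's implementation: the `Assets` container, the `retrieve` and `build_prompt` helpers, the sample data, and the 0.5 overlap threshold are all hypothetical, and retrieval here is plain keyword overlap, whereas concept-based retrieval would typically rely on embeddings.

```python
import re
from dataclasses import dataclass


@dataclass
class Assets:
    """Hypothetical container for linguistic assets (not a real platform schema)."""
    glossary: dict[str, str]            # source term -> approved target term
    do_not_translate: set[str]          # terms that must stay in the source language
    translation_memory: dict[str, str]  # legacy source segment -> legacy translation


def tokenize(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for real linguistic tokenization."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(source: str, assets: Assets, tm_threshold: float = 0.5) -> dict:
    """Keyword retrieval: keep only the asset entries that overlap the source text."""
    words = tokenize(source)
    glossary = {s: t for s, t in assets.glossary.items()
                if tokenize(s) and tokenize(s) <= words}
    dnt = sorted(t for t in assets.do_not_translate
                 if tokenize(t) and tokenize(t) <= words)
    tm_hits = []
    for segment, legacy in assets.translation_memory.items():
        union = words | tokenize(segment)
        if not union:
            continue
        # Jaccard overlap between the new text and a legacy segment
        score = len(words & tokenize(segment)) / len(union)
        if score >= tm_threshold:
            tm_hits.append((round(score, 2), segment, legacy))
    tm_hits.sort(reverse=True)
    return {"glossary": glossary, "do_not_translate": dnt, "tm": tm_hits}


def build_prompt(source: str, target_lang: str, assets: Assets) -> str:
    """Assemble a compact prompt from the retrieved entries only."""
    hits = retrieve(source, assets)
    lines = [f"Translate into {target_lang}."]
    lines += [f"Use glossary term: {s} -> {t}" for s, t in hits["glossary"].items()]
    lines += [f"Do not translate: {term}" for term in hits["do_not_translate"]]
    lines += [f"Reference translation: {seg} => {leg}" for _, seg, leg in hits["tm"]]
    lines.append(f"Text: {source}")
    return "\n".join(lines)


# Illustrative data only; "AcmeCloud" is a made-up product name.
assets = Assets(
    glossary={"dashboard": "tableau de bord", "invoice": "facture",
              "shopping cart": "panier"},
    do_not_translate={"AcmeCloud", "API"},
    translation_memory={
        "Open the dashboard to view reports":
            "Ouvrez le tableau de bord pour consulter les rapports",
        "Your invoice is ready": "Votre facture est prête",
    },
)
source = "Open the AcmeCloud dashboard to view your reports"
prompt = build_prompt(source, "French", assets)
print(prompt)
```

With this sample data, only the "dashboard" glossary entry, the "AcmeCloud" do-not-translate term, and the one overlapping translation-memory segment reach the prompt; the invoice and shopping-cart entries are left out, which is the cost saving the takeaway describes.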

Frequently asked questions

How much does AI speed up human translation according to Smartling?: Olga Beregovaya, VP of AI at Smartling, sets the human baseline at roughly 2,000 words per day per translator. With a well-trained, well-prompted, fine-tuned model doing the heavy lifting and a human compensating for the delta AI can’t cover, she describes the productivity gain as “magnitude and order of magnitude.” She frames the combined result as lower costs, higher productivity, and much more predictable quality — though she stops short of citing a single fixed multiplier.
Why do English-centric AI models cause problems for translation?: Beregovaya explains that most foundational models remain “English language and English phenomena centric,” so even with correct vocabulary the model can describe American-culture-centric phenomena while sounding fluent in the target language, costing factual and anthropological accuracy. Models also react poorly to prompts written in languages other than English — she notes a carefully crafted English prompt often produces a better target-language result than a linguist’s prompt in their own native language. Sometimes neural machine translation outperforms an LLM outright.
How does RAG improve AI translation quality?: According to Beregovaya, translation runs on assets like translation memories (legacy translations as a parallel corpus/bitext), glossaries, do-not-translate lists, and style guides that can run 100 pages. Feeding all of that into a prompt is a “waste of fortune”; RAG instead retrieves only what’s relevant, by keyword or concept, from curated sources. She calls RAG a “huge” game-changer and “a great way of mitigating biases and hallucinations,” especially when combined with knowledge graphs for structured retrieval.
Are smaller purpose-built models better than large models for translation?: Beregovaya says smaller, purpose-built models currently deliver more predictable, more explainable outputs for specific languages and domains, citing research that a model handling more tasks degrades on each individual one — “jack of all trades, master of none.” But she hedges this as a “pause and see” moment, expecting larger foundational models to eventually catch up as they learn from field feedback and expand language coverage. She also flags a fine-tuning floor of roughly 100,000 strings.
When will AI erase the language barrier in translation?: Beregovaya is candid that it’s her opinion amid passionate industry debate, but says the trajectory toward human parity is climbing sharply — faster for romance-language families, slower for agglutinative or Finno-Ugric languages. She believes “that eventually is around the corner — it can be three years, it can be five years,” and that the “language barrier as we know it is going to be erased.” Until biases and hallucinations are solved and language coverage is even, linguists will still be needed to produce datasets and fact-check output.

Concepts in this episode

AI terms discussed here — each links to a plain-language definition.

Inference · Accuracy · AI Agent · Latency · Retrieval-Augmented Generation (RAG) · Human in the Loop · Tokenization · Artificial General Intelligence (AGI) · Explainability · AI Hallucination

Chapters

00:00 Introduction and Guest Welcome
01:14 Evolution of NLP: From Rule-Based to Machine Learning
02:40 Challenges in AI Translation
04:21 Biases in Language Models
05:28 Inference Time and Latency
05:44 English-Centric AI Models
08:53 Opportunities in AI Translation
09:14 Industries Benefiting from Language AI
10:36 Human-in-the-Loop Translation
12:06 Architectural Innovations in Language AI
16:20 Success with RAG Architectures
17:58 Multilingual Vectorization
19:54 Agentic AI in Translation
24:35 Data Sets and Data Privacy
28:30 Using Smaller, Purpose-Built Models
32:10 Future of AI in Translation
36:37 Conclusion and Farewell

Show notes

Are we on the verge of removing all language barriers with AI?

Olga Beregovaya, VP of AI at Smartling, joins host Conor Bronsdon to tackle this question, discussing the evolution from rule-based NLP to today's powerful LLMs. Together, they confront the persistent challenges that stand in the way, like the English-centric nature of AI, domain-specific inaccuracies, and the unpredictability of model hallucinations. Olga unpacks the difficulties faced when striving for accurate, nuanced translation across all languages, especially under-resourced ones.

Beyond these hurdles, the conversation explores the cutting-edge opportunities and technical innovations driving progress, including RAG, the rise of purpose-built models, agentic AI workflows, and the potential of multilingual multimodality. Olga shares insights into boosting translator productivity, achieving more predictable quality, and the path toward human parity in translation, examining how technology and human expertise will shape the future of global communication.

Connect with Chain of Thought host Conor Bronsdon:

Newsletter: https://newsletter.chainofthought.show/
Twitter/X: https://x.com/ConorBronsdon
LinkedIn: https://www.linkedin.com/in/conorbronsdon/
YouTube: https://www.youtube.com/@ConorBronsdon

Follow Today's Guest(s)

LinkedIn Olga Beregovaya

LinkedIn Smartling

Check out Galileo

Try Galileo

Transcript

Olga Beregovaya 0:00 You could not even imagine the scale and the passion of the debate that's taking place in the translation industry at the moment. I think, just looking at the metrics that we are able to capture as an industry and in my company, that trajectory towards human parity goes upwards like this. For some languages, less. For some languages, more. But I think language barrier as we know it is going to be erased.

Conor Bronsdon 0:20 Welcome back to the Chain of Thought podcast. I am your host, Conor Bronsdon, and I'm excited to introduce today's guest, Olga Beregovaya, vice president of AI at Smartling. Olga, welcome to the show. [Olga Beregovaya: Hi, and thanks so much for having me.] So let's jump right in. Olga, can you catch us up on what's been happening in the language AI space and how it's changed over the last couple of years?

Olga Beregovaya 0:47 Well, I think obviously the last couple of years, right, ever since that paper, “Attention Is All You Need,” came out in the world, the world will never be the same. And large language models are called language models for a reason, right, because they're built on natural language and all manifestations thereof, and we'll talk about it when we get to multimodality.

Olga Beregovaya 1:05 But, obviously, a lot of exciting things have been happening since the introduction of transformer models. What I want to start with, though, is that when we talk about language AI, I think we should be fair to rule-based natural language processing, NLP. Right? And maybe talk a little bit about, okay, so we start from rule-based NLP. Right? Then we go to statistical methods,

Olga Beregovaya 1:27 and then we get into the realm of actual machine learning that gets introduced into the world of natural language processing. There are a lot of instances where good old NLP, good old traditional rule-based and statistical NLP, is still working. Right? We still have fantastic libraries, and they definitely give us a lot of predictability.

Olga Beregovaya 1:46 So I think, I mean, the most exciting thing that has been happening is the opportunities that we now have at scale processing natural language, and developing applications and tooling for processing natural language at lightning speed that we could not have before. I mean, I come from the world of syntax parsers and rule-based machine translation,

Olga Beregovaya 2:08 and it obviously took a lot of time to phase in one language. Right? If you look back to the days of AltaVista Babel Fish. And what you can do now is exceptionally phenomenal, and I hope we get a chance to talk about challenges of individual languages. But at the end of the day, it's the pace, the accuracy, and, again, the scale at which we can process natural language, which I think is absolutely, truly phenomenal. Having said that, the Turing test is still a thing. Yes. It is.

Conor Bronsdon 2:36 Honestly, I think you gave us a great jumping off point here, which is challenges. I mean, we can talk about opportunities. I think a lot of conversations cover that. I'm sure we will. But I'm curious what challenges you're seeing in this current era of translation and what's coming next.

Olga Beregovaya 2:52 Okay. So, I mean, again, there are tons of opportunities, and we'll touch on them later. The challenges, if I were to rank them in order of appearance, the first one would probably be lexical and domain coverage. Like, that would be the first one. And maybe we would focus on Gen AI, but please take me away from the world of Gen AI because I definitely don't wanna be stuck there. And if there are other things, other techniques you want to talk about, I'd be delighted to.

Olga Beregovaya 3:20 But sometimes, if you think about it, the models, the foundational generative models, are trained on the plethora of the world's knowledge, and that world's knowledge can be noisy. But even then, it can still not have the knowledge for specific domains, and even less so in long-tail languages. So I would say lexical and domain coverage and accuracy of translation for specific domains,

Olga Beregovaya 3:43 long-tail languages, right? And there is a huge, I don't know what color, elephant in the room: most of these models are still English language and English phenomena centric, right? So as you get into longer-tail languages, under-resourced languages, this is where the challenges are. And for instance, me working at a company that provides translation technology

Olga Beregovaya 4:06 and human translation combined, AI-powered translation, we cover every language under the sun. And this is where Language AI still falls short. That would be the next one. The other one obviously is biases. I know that we speak a lot about biases, but the less covered and less represented the language is in the model, the higher the odds are that biases will be introduced.

Olga Beregovaya 4:29 I mean, there are unintentional biases. We all know what they are. There are cultural biases, there are historic biases, factual biases, all sorts of things. And they are magnified to extreme magnitude when you talk about multiple languages. And then we have good old, or good new, hallucinations. Models do hallucinate, and dealing with natural language, you obviously want to have the most predictable

Olga Beregovaya 4:55 and most accurate output, even more so if you service different industries and some of them are extremely sensitive to errors. And the model, it's like, I do horseback riding, and it's like riding a stubborn horse with a mind of its own. And I mean, yes, you do have some controls in place, but at the end of the day, the model seems to be making its own decisions, as much as you think that you crafted the most accurate prompt that would drive the most predictable outcomes. So these would be my top four challenges. Oh, yeah. No. Sorry. Forgot. Inference time and latency. Apologies.

Conor Bronsdon 5:28 Apologies. Yeah. There is another one. I mean, I think to your point here, there are a multitude of factors that impact our success here, whether it's purely with translation or with AI more broadly. And one of the key factors you mentioned that I think is maybe underrated is how English-centric a lot of this work is. And there's incredible research being done in China and elsewhere, but so much of the business world, so much of the AI world, is functioning based off of English-focused models.

Conor Bronsdon 5:58 And obviously, that's great for folks like ourselves who are fluent. You know, it's very telling, I think, that this podcast is an English language podcast. But it also means there are edge cases, to your point, where we have other languages that we're trying to translate with or trying to create in, and there are major blockers around that, or biases that can come in. Can you expand a bit more on how this English-centric nature of

Conor Bronsdon 6:24 today's AI models, particularly LLMs, is impacting your work?

Olga Beregovaya 6:29 Absolutely. So first, it's very funny because sometimes you can have the lexical coverage. You can have the right word. But when the model starts describing a phenomenon... Let me try and see how I can explain this. So you can, for instance, read an article by an Italian researcher who was receiving, give or take, fluent and smooth Italian, but the phenomena that she was trying to translate into Italian were 100%

Olga Beregovaya 6:54 American culture centric. So first things first, you do lose on factual accuracy and anthropological and phenomenological local phenomena. So that would be one thing. The other thing would again be the accuracy of translation. Right? The accuracy of translation and sometimes you get completely random translation. Quite often, neural machine translation would do much better

Olga Beregovaya 7:16 than a large language model would do. So it impacts our work. Also, models actually do not react well, or do not often react well, to prompts that are written in languages other than English. And sometimes you would have linguists, and sometimes you would have bilingual linguists, of course, but sometimes you would have a linguist creating a prompt in their native language,

Olga Beregovaya 7:39 hoping for the best result. But then you quickly find out that a carefully crafted prompt in English will render a better result in the target language. So I think there are, yeah, there are multiple things. And again, factual inaccuracy is probably one of the biggest challenges. We also find ourselves fine-tuning models with language-specific data and domain-specific data because that allows us to expand the language coverage

Olga Beregovaya 8:05 and domain coverage. Right? And this is where we are in the world of purpose-built models, or models fine-tuned for purpose.

Conor Bronsdon 8:14 Yeah. I think this is a really interesting conversation because while there's incredible value from English-centric models, and they obviously have these deep capabilities that can expand outside of English language, it also points to the edges of what's possible today and where there's a lot of opportunity for AI in the future. And I'll say personally, I think it's been overblown, this idea of AGI. Like, if artificial general intelligence

Conor Bronsdon 8:43 is only working in a single language, is it really general? Is it general? Yeah. How general is it, you know? Yeah. So I'm curious. You know, we talked a bit about some of the challenges, the critiques, the areas for improvement here. What about the opportunities, particularly with AI translation? Where are you seeing these major, you know, successes recently or opportunities for growth?

Olga Beregovaya 9:07 Okay. First, maybe we'll talk about industries or domains or verticals that benefit, and so far have benefited most, from the introduction of language AI, or rather the evolution of language AI. And those industries are usually the ones that need instant translation, instant, reasonably high-quality translation for shorter shelf life content, right, content that needs

Olga Beregovaya 9:34 instant, immediate publication, informative and actionable content. And this is where we would probably talk about industries like opinion portals, right? Or like Reddit, for instance, phenomenal successes with AI-based translation. So industries that have slightly higher risk tolerance, those industries have definitely experienced the biggest successes. Obviously, multilingual chatbots are a thing,

Olga Beregovaya 10:01 but having said that, you need to properly train your multilingual chatbot, properly fine-tune it. Otherwise, for instance, if you use it instead of or in addition to your knowledge base, it can quickly start giving inaccurate advice, but there are likely multiple techniques of training your multilingual chatbots on your multilingual knowledge base content.

Olga Beregovaya 10:20 So successes, main successes, as I said, would be shorter shelf life, higher risk tolerance. Now this is all without human in the loop. So these would be the main successes that we've seen without human in the loop. Now when you introduce human in the loop, I think the biggest opportunities are around the way translators work these days. If we think about the evolution of translation, we go from pure human translation,

Olga Beregovaya 10:44 right, to computer-assisted translation, or what we call in the translation universe CAT tools. From there, you go into computer-assisted translation as a human performing the translation. Right now, I think we're in the age where AI does the heavy lifting, introduces predictability, does the lion's share of the work, takes us much closer to human parity than ever before.

Olga Beregovaya 11:09 And the human translator or fact checker or validator really compensates for the delta that AI is not able to cover. Now, what does it mean? It means productivity. On average, a translator can deliver 2,000 words a day. That's the metric. Now we're talking about, I mean, AI and the language model, fine tuned model performing well. We're talking about magnitude and order of magnitude.

Olga Beregovaya 11:35 And then also, if you think about it, the well trained and well prompted model will give a very predictable output and reduce the potential for human error. So here you are in a package, get lowered costs, higher productivity, and much more predictable quality.

Conor Bronsdon 11:53 I appreciate these insights into the opportunities you see, Olga. And I'd love to dive into the technical side of this a bit. What are some of the key architectural evolutions or innovations that are driving progress in Language AI today?

Olga Beregovaya 12:07 I would want to be very careful about limiting Language AI just to large language models. Right? So I'd probably say that, I mean, first of all, one thing that the whole world, not just language AI, is, I wouldn't go with struggling with, but a problem that everybody's solving, is obviously the computational cost,

Olga Beregovaya 12:29 right, and the GPU consumption. I mean, that's not a secret. So I would say what's obviously driving innovation, and the challenge that we're always solving, is how do we minimize the computational cost and what can we do? And for instance, we have a researcher on my team who manages to run and get amazing results with smaller models, running them off her laptop. We're not going to be exposing her to that for much longer. She's actually getting a big machine, but I would imagine that whoever manages to get the best outcomes

Olga Beregovaya 13:02 from the smallest models that consume the least GPU power is probably going to win. And that, again, that's something that we deal with every day. Right? And that obviously drives latency. And one of the biggest innovations, for instance, in our platform, and kudos to our engineering, they managed to introduce queuing, caching, multithreading, and everything else that helps reduce inference time. Because I would imagine that, again, it's not just us, but it's many other industries that deal with inference time, model inference time, and latency.

Olga Beregovaya 13:34 So once that has been overcome, that actually opened infinite opportunities for us to introduce generative AI and large language models into our tech stack. Because initially we service a lot of industries that, as I said earlier, require instant translation or instant turnaround. And then we say, yeah, hey, you know, we're gonna write the world's best prompt. It's gonna be X number of tokens. And by the way, it's gonna take five minutes to run. That's obviously not gonna fly. So solving for that issue was huge. Smaller models,

Olga Beregovaya 14:05 latency solving and inference, great opportunities. Again, ability to fine-tune, right? Because initially models were not trainable and fine-tunable. And now different fine-tuning techniques, it's a huge thing for us. It's very pleasant to see consistency in presence and availability of tokenizers for different languages. Because initially, again, if you feed Swahili and all you have at your disposal is an English tokenizer, it's not gonna get you far. It's gonna give you the most disastrous results.

Olga Beregovaya 14:38 So, I would say tokenization from a technology perspective, and availability of tokenization. Same applies to multilingual vector space, right? If you want to phase in a language, you obviously need to be sure that this language can be vectorized. So these are all, they may sound like smaller things, bigger things, but if you look at them in ensemble,

Olga Beregovaya 14:57 these are the great opportunities that help us drive language AI forward. We operate in the world where tone and voice is everything. And you have a style guide, and style guides can be 100 pages, right? You have what we call translation memory. I don't know if you are familiar with translation memories. That's basically legacy translation, right, compressed into a database and existing as a parallel corpus, as a bitext.

Olga Beregovaya 15:22 So you have translation memories, you have dictionaries, you have do-not-translate lists and style guides. And yes, you can feed it into a prompt, waste of fortune, make no money, you know, or you can actually apply RAG techniques and only fetch what's relevant from external sources. So I would say that RAG was a huge, huge, huge, availability of RAG was a huge game changer. And we're definitely making the best of it in

Olga Beregovaya 15:45 our platform because we manage all those linguistic assets within the platform. And that's one of the examples. Yes, you can go out to external sources, or you can actually have it all beautifully validated, clean, well-prepped, well-processed, well-curated data, living all in the same platform, at the same time not having to invest into expensive fine-tuning or building 30,000-token prompts.

Olga Beregovaya 16:09 These would probably be the main drivers that I would call out. Again, language expansion, of course, coverage. The more languages are covered, the better. [Conor Bronsdon: What have you found has driven success with RAG architectures, and maybe what have been some of the challenges that have been experienced?] I would say that what has driven success for us specifically, and maybe I'm speaking a little bit from

Olga Beregovaya 16:31 the myopic, narrow space of translation technology and translation, but what has been very successful for us is exactly this, the availability of external data sources still confined within our platform and being reliable. So I think that's probably the best advantage for us, right? When you combine generative AI with retrieval technology,

Olga Beregovaya 16:54 that's basically where you get the best bang for your buck. Right? Because it retrieves, and there are some fun things you can do there. Like, you can retrieve based on keywords. Right? You can retrieve based on concepts. RAG is a great way of mitigating biases and hallucinations. So that's another thing: obviously, when you can reference a reliable, curated external source that's not inherent to the model itself, that obviously adds

Olga Beregovaya 17:18 another level of assurance. A lot of great things come from combining generative AI with knowledge graphs, right? Where again, I think it's probably a phase of RAG, where again, you get the best of both worlds because not only do you retrieve external information, but you retrieve it from a structured, organized environment. So I hope that answers your question.

Conor Bronsdon 17:38 Yeah. I think that's a great call. It makes me think that we should do, like, an end-to-end RAG episode at some point. Because while it's been out as a technique for a while, there's just so much you can do with it. And so you're giving me a lot of personal inspiration here, which I am appreciating. Another thing that stuck out to me about what you were saying,

Conor Bronsdon 17:57 a couple minutes before that, was you mentioned multilingual vectorization as something that's both an opportunity and a challenge. Can you expand on that and maybe define it a bit more for our audience so we can understand the problem and/or opportunity there?

Olga Beregovaya 18:12 Oh, the problem is very simple. The problem is if you add a language that is not covered by a specific model, you can only hope that that model can actually vectorize this particular language. So you need a vectorization. Yeah. Yeah. And if the language is not covered, yes, you will still get an outcome, but it will be, to a great extent, as we call it in the linguistic world (maybe it's called that in other universes as well), upload and pray. So

Olga Beregovaya 18:36 you are like, unless you have predictable vectorized output, you're really in the upload and pray territory. So I think that's the challenge. That's the challenge. And again, it's great for us to see that a lot of models and equally open source models and commercial models are getting really, really good at being able to vectorize multiple languages. The other opportunity is obviously how much information you can retrieve and the quality of translation you can get if you can actually have a multilingual or interlingual

Olga Beregovaya 19:06 vectorization, where you can actually pull information across different languages.

Conor Bronsdon 19:10 Is part of the challenge here around, I think, the English-centric nature of models again, and how that drives translation, and then essentially that the pre-processing, pre-vectorization, like, the chunking that's occurring is incorrect, and therefore the vectorization is off? [Olga Beregovaya: Correct. Correct. It would be off, and then subsequently, everything else the model would be performing would be off.] That makes total sense. Yeah. I hadn't thought about that kind of

Conor Bronsdon 19:36 preview of what would happen here. But I think it's similar to when you see a hallucination introduced earlier in a system, and that kind of flows throughout, because every piece of the workflow after that, if it's referencing that same data, can be inaccurate. [Olga Beregovaya: Correct.] I'm curious what's been happening on another hot topic of this year, agentic AI, obviously. How has that been interacting with the

Conor Bronsdon 20:01 translation space?

Olga Beregovaya 20:03 I was so ready for this question. I was even wondering how come it took us so long to get to it. [Bronsdon: I got excited. We've got a lot of other things to talk about.] No, actually, I was saving it for later, because obviously the whole world is talking about agents, right? The whole world is capitalizing on them. And when we were talking about opportunities,

Olga Beregovaya 20:22 I maybe deliberately separated the linguistic nature of translation AI and language AI from the more workflow and processing nature of translation AI. We absolutely are working with agents and agentic AI, and I think we all know, right, that agents are, give or take, a new phase of robotic process automation, perhaps, right? Individual agents that are built for individual purposes.

Olga Beregovaya 20:49 And quite often we ask ourselves a question: what problem are we solving here? Does it call for an agent? Or are we doing just fine with rule-based approaches?

Conor Bronsdon 21:01 First things first: if I need to raise money off this, I have to call it an agent. Sorry.

Olga Beregovaya 21:06 Okay. Call it an agent. Okay. Great. So there are two things, as we obviously know. There is agentic AI, which acts much more independently. And then there are individual agents that are fit for specific tasks, specific purposes. And obviously you introduce far fewer risks with just individual agents (and I'll bring one example in a second) than if you actually entrust your entire translation process to an agentic process, to an agentic, independent workflow.

Olga Beregovaya 21:39 So let me use one example: self-healing. There are two main vectors in the translation space, two main things people are working on. Thing one, quality of translation, getting as fast and as close as we can to human parity. But how do you know? How do you know whether you got there or you didn't get there? And that's where you go into the whole quality estimation,

Olga Beregovaya 22:05 automated quality evaluation space, which is absolutely crucial in our translation industry, right? Because whether it's the B2B sector or B2C, we take responsibility for a factually accurate, actionable, and usable response. So if you combine the two, what happens? Say you've applied the best prompt engineering or more conventional NLP techniques, and you've delivered the best output,

Olga Beregovaya 22:30 but then you run quality estimation on top of it and your quality estimator can say, Hey, you actually made mistakes here, here and here. And this is pretty inherent, for instance, to our platform. Now, what a smart agent would do is just say, Hey, let me go and self-heal the whole thing. Let me execute on the task. We identified such and such errors, and we would be dealing with stylistic,

Olga Beregovaya 22:52 grammatical, factual, and so on and so forth. Let me actually go back and heal everything that I've just identified. So that would be an example of what would be happening in the actual translation quality estimation and linguistic field. But it is much broader than that. A lot of decisions are made. For instance, how do you route your content?
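[Editor's sketch] The self-healing loop described above (translate, run quality estimation, repair what the estimator flags, check again) might look roughly like the following. This is a minimal illustration, not Smartling's implementation: `self_heal`, `toy_estimate`, `toy_repair`, and the `TYPO_FIXES` table are hypothetical stand-ins for a real quality estimator and a real repair agent.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    kind: str    # e.g. "stylistic", "grammatical", "factual"
    detail: str


def self_heal(draft, estimate, repair, max_rounds=3):
    """Estimate quality, repair flagged issues, and repeat until clean or out of rounds."""
    history = []
    for _ in range(max_rounds):
        issues = estimate(draft)
        history.append(issues)
        if not issues:
            break
        draft = repair(draft, issues)
    return draft, history


# Hypothetical toy rules standing in for a real quality-estimation model.
TYPO_FIXES = {"teh": "the"}


def toy_estimate(text):
    issues = []
    if "  " in text:
        issues.append(Issue("stylistic", "double space"))
    for word in text.split():
        if word in TYPO_FIXES:
            issues.append(Issue("grammatical", word))
    return issues


def toy_repair(text, issues):
    for issue in issues:
        if issue.kind == "stylistic":
            text = " ".join(text.split())  # collapse repeated whitespace
        elif issue.kind == "grammatical":
            text = " ".join(TYPO_FIXES.get(w, w) for w in text.split(" "))
    return text


healed, history = self_heal("Click  teh button", toy_estimate, toy_repair)
print(healed)
print([len(issues) for issues in history])
```

The `max_rounds` cap is a design choice for the sketch: a repair step can introduce new errors, so an unbounded loop is not guaranteed to terminate.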

Olga Beregovaya 23:12 How do you route it? Does it require human translation? Does it require AI with human validation? Or can it completely go through machine translation and translation with generative AI? And again, this is where you have a perfectly trained, perfectly fit for purpose agent that can make this independent decision and actually route it in the right direction. And then there are so many other things like

Olga Beregovaya 23:35 translator assignment. There are a lot of things you can do where an agent can actually make an independent decision based on has this translator delivered quality? Has this translator delivered on time? Has this translator made him or herself available? And again, this is where an agent can make an independent decision and it takes a lot of burden off supply chain management.
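[Editor's sketch] The two decisions just described, routing content and assigning a translator, can be illustrated with simple rules. Everything here is assumed for illustration: the thresholds, the `predicted_quality` score, the `high_risk` flag, and the weighting between quality and timeliness are hypothetical, and a production agent would presumably learn or tune them rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Job:
    text: str
    predicted_quality: float  # hypothetical 0..1 quality-estimation score for machine output
    high_risk: bool           # hypothetical flag, e.g. legal or medical content


def route(job, auto_threshold=0.9, review_threshold=0.6):
    """Pick one of the three paths mentioned in the conversation."""
    if job.high_risk or job.predicted_quality < review_threshold:
        return "human_translation"
    if job.predicted_quality < auto_threshold:
        return "ai_with_human_validation"
    return "ai_only"


@dataclass
class Translator:
    name: str
    quality: float       # share of past deliveries that passed review
    on_time_rate: float  # share of past deliveries that met the deadline
    available: bool


def assign(translators, quality_weight=0.6):
    """Choose the best available translator by a weighted track-record score."""
    candidates = [t for t in translators if t.available]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda t: quality_weight * t.quality + (1 - quality_weight) * t.on_time_rate,
    )


pool = [
    Translator("A", quality=0.90, on_time_rate=0.70, available=True),
    Translator("B", quality=0.95, on_time_rate=0.95, available=False),
    Translator("C", quality=0.80, on_time_rate=0.95, available=True),
]
print(route(Job("Terms of service", 0.95, high_risk=True)))
print(route(Job("Button label", 0.95, high_risk=False)))
print(assign(pool).name)
```

Translator B has the strongest record but is filtered out for being unavailable, which mirrors the three questions raised above: quality, timeliness, and availability.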

Olga Beregovaya 23:56 So I would say that the trajectory and the areas of impact are probably very predictable and shared with other industries. But this is what is specific to us: basically, decision making.

Conor Bronsdon 24:06 I appreciate you drilling down on that. That's super useful, and I think illustrative of how complex the decisions are around, you know, when to implement agentic workflows or when to focus on other areas of the technology. How important do you think access to large datasets for continued advancement

Conor Bronsdon 24:28 is gonna be in the field of translation, and are there challenges around data privacy that may come into play?

Olga Beregovaya 24:32 I would say maybe we'll start from, again, generalized foundational models. I think we've kind of solved the problem of accessing large datasets. Again, give or take, it's different for different languages, but give or take, we know how to scrape, I mean, we know where to get that data. And I think now we are in the era of clean

Olga Beregovaya 24:55 datasets and smaller datasets. And there are a lot of experiments and a lot of successes, actually, like, say, taking Llama and building a smaller, language-specific model with a very clean, smaller dataset, one that, for instance, would be specific to a particular language. So I would say that I'm a big proponent, and probably the entire industry is,

Olga Beregovaya 25:16 maybe. I mean, there are obviously thresholds, right? Like, for instance, you cannot fine-tune a model unless you have, like, your 100,000 strings. It's not gonna get you anywhere. But I think we are at the threshold where we need to strike the balance between how much data we need and, most importantly, how clean that data needs to be. So I think access to cleaner datasets and data curation techniques are probably even more important in this day and age than access to a larger but potentially noisy dataset. And probably what you've seen, and I'm sure you've seen it in other industries: yes, you have all the data in the world. You start curating, you start trimming, you start pruning. And before you know it, you land with a smaller dataset,

Olga Beregovaya 25:59 but it's absolutely fit for purpose. And it absolutely is going to address your needs. Actually, ironically, just today I was speaking to a colleague of mine about what kind of labeled dataset you need to feed into the language model so it can help a translator with quality predictions. And her biggest struggle is complete inconsistency between what you have in your bitext, what you have in your glossary,

Olga Beregovaya 26:23 and what you have in your style guide. So dataset harmonization and dataset curation is probably more essential for our success than actual access to huge datasets. And we see smaller models becoming more and more successful, right? If you see, if you look at the recent models being released, you have X parameters, X parameters, and X parameters, and sometimes you see better performance on a smaller model.
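[Editor's sketch] One concrete form of the dataset harmonization mentioned above is checking legacy translations (the bitext) against the glossary before any of it is used for training. The function below is a deliberately naive illustration: `find_glossary_conflicts` is a hypothetical helper, the sample pairs are invented, and the plain substring match ignores inflection and word boundaries that a real check would need to handle.

```python
def find_glossary_conflicts(bitext, glossary):
    """Flag segment pairs whose source uses a glossary term but whose
    target lacks the required rendering of that term."""
    conflicts = []
    for index, (source, target) in enumerate(bitext):
        for term, required in glossary.items():
            uses_term = term.lower() in source.lower()
            has_rendering = required.lower() in target.lower()
            if uses_term and not has_rendering:
                conflicts.append((index, term, required))
    return conflicts


# Invented sample data: the glossary mandates "panel" for "dashboard".
glossary = {"dashboard": "panel"}
bitext = [
    ("Open the dashboard", "Abra el panel"),
    ("Close the dashboard", "Cierre el tablero"),
    ("Save your changes", "Guarde los cambios"),
]

conflicts = find_glossary_conflicts(bitext, glossary)
print(conflicts)
```

Segments flagged this way can be corrected or dropped, which is the trimming and pruning step that leaves a smaller but more consistent dataset.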

Conor Bronsdon 26:46 So similar to this question around datasets, where it sounds like really it's more the nitty-gritty of how you are labeling data, how you are explaining data, that matters more: do you think that there has been an overfocus on massive, broad models that are maybe less explainable, less interpretable, versus this opportunity for smaller models that maybe have higher degrees of explainability to be leveraged with language AI?

Olga Beregovaya 27:16 Well, it depends on what you want. Right? I mean, we touched a little bit on AGI. The whole idea behind generalized language models is they would do everything, right? They will give you a super recipe, they will summarize your legal proceedings, and everything else under the sun. And obviously they are there for a reason, and that was a fantastic start. But I mean, when we come to explainability,

Olga Beregovaya 27:43 we quite often see, again, you think you wrote the most predictable prompt, you think that you have the most predictable fine-tuning dataset. And then the model spits out the least expected output when you think that you did your absolute ultimate best. So how far did we get? I mean, are we at the age of complete AI, Gen AI explainability? I wouldn't say so. I think there's quite a bit of black box, and the bigger the model, right, the larger the model, the

Olga Beregovaya 28:10 more of the black box factor you get. Now, with chain of thought, well, pun intended, but obviously, who is chain? Obviously with chain-of-thought techniques, you get a little bit more visibility into what the model does, because at least it explains its behavior. And obviously, the latest generation of reasoning models are even more explainable, but I'd agree with you 100%.

Olga Beregovaya 28:34 A purpose-built, single-purpose model would provide much more predictable outputs, and its performance and its behavior would be much more explainable, because you build it for this particular task and for this particular domain.

Conor Bronsdon 28:48 I'm really interested about this idea that you're bringing up of smaller purpose built models being more successful here, because it kind of goes against the long term bitter lesson of AI research, right, which was that general methods that leverage major amounts of computation are ultimately the most effective. That's what we were kind of taught for years in AI research. Now,

Conor Bronsdon 29:12 because we've hit, I don't wanna say a ceiling, but we've kind of reached a certain level, now we're finding so much more success with these purpose-built models, these fine-tuning techniques. So it's very fascinating to see something that I think we all expected would be true at some point starting to become true when it comes to things like translation.

Conor Bronsdon 29:34 I guess my question would be, do you think that's the case, or do you think we're just kind of at a pause moment before the next more generalized model? What's your perspective on what the future looks like here?

Olga Beregovaya 29:47 I think we possibly are at a pause and see what's next. As we, again, follow the news, there are new models released every day. But you probably have read the same research, which suggests that the more tasks the model can handle, the more the performance degrades on those individual tasks, right? Because at some point, the model becomes a jack of all trades, right, and the master of none, if

Olga Beregovaya 30:10 my English is not deceiving me here. But at the same time, obviously, nobody's resting on their laurels here. And I think that maybe the smaller, purpose-built models are probably just the thing that will be outperforming generalized models in the foreseeable future, until the developers of larger foundational models actually have mastered language handling, translation tasks, mitigating hallucinations,

Olga Beregovaya 30:39 and all the usual suspects. So I think it is even a simple thing like language coverage. If you're following the trajectory of different AI models, we started fairly small, right? We started with, like, a specific model that would handle well eight to 10 languages, eight to 12 languages. Right? And now when you look at the latest releases from the technology giants that I would probably not name, but we all know who they are, you see that the language coverage expands, and we are extensively testing performance across different languages. And we do see that things that are now served better by smaller purpose-built models are also being fixed in the larger models, because they learn their lesson. And the more AI is being implemented

Olga Beregovaya 31:24 in the translation space, the more feedback the researchers and the developers get from the field, and the better they know what to fix. I also think, I mean, you touched on dataset size and number of parameters. I think we're getting over that excitement about sucking in every piece of data that you can possibly find and, whatever you do not find, just augmenting with synthetic data. I think that people are becoming much smarter with curation techniques.

Olga Beregovaya 31:50 So I think eventually, again, it's only my opinion, but I think, okay, the smaller purpose-built models are gonna be there, and they're gonna be there for a certain time, and they're gonna deliver, but eventually the larger models will catch up.

Conor Bronsdon 32:03 Let's switch gears a bit and talk about multilingual multimodality, something we mentioned briefly at the top of the episode. Can you explain for the audience what that means and how it's developing?

Olga Beregovaya 32:18 So multimodality is obviously multimodality. Right? And we need to remember that the models can handle not just text (and, asterisk, translation is not just text either), but models can equally handle images. Right? Certain models can handle videos. We, in our universe, for instance, need to feed PDFs into the model, and sometimes they are flat PDFs. And there are a lot of

Olga Beregovaya 32:42 different other modalities that models are now capable of handling, right? And many of them, some would be, again, purpose built models and sometimes would be general models that can actually handle multiple inputs. So that, again, opens infinite opportunities for the translation space. Let me give you one example, image generation, multilingual image generation.

Olga Beregovaya 33:03 For years and years, a lot of money was spent on what we refer to as desktop publishing: extracting text and, like, translating images. Now all you really need to do is just say, reproduce this image in a target language. Right? Obviously, again, provide certain translation guidelines on how exactly you want to translate it: persona, tone, voice, everything else. And voila, there we go. You do not need to spend tons and tons of money on desktop publishing.

Olga Beregovaya 33:30 Then there are multilingual digital humans. And we do see, maybe it's a big statement, but no, I would say that we're already quite there in terms of being able to support digital humans in multiple languages within a single model. If you look back in time, for instance, if we look at automated speech recognition and then text to speech, it actually took four steps,

Olga Beregovaya 33:56 right? You recognize the text, you run machine translation, you render the target translation, and then you generate the actual speech, the actual target speech. And now, with multilingual multimodality, you actually can conveniently fit it all within a single model. You don't need these three or four transfers. [Bronsdon: Right. Okay. So that, again, gets very complicated because you have all the different potential routes to go down.] Yeah. And every step you introduce, you introduce potential for error. Right? You're gonna have poor ASR, you're gonna have poor machine translation,

Olga Beregovaya 34:27 then you can potentially have, like, completely synthetic-sounding generated, rendered text. And now I think multilingual and multimodal AI again presents tremendous opportunities, both for simplification of processes and for the quality of the output.
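[Editor's sketch] The point that every added step adds potential for error can be made concrete with a little arithmetic. Assuming, purely for illustration, that each stage of the four-step cascade succeeds independently with some probability, the chance that a segment survives the whole pipeline is the product of the stage rates. The rates below are invented, not measured figures from the conversation.

```python
import math


def pipeline_success(step_rates):
    """Probability that every stage succeeds, assuming stages fail independently."""
    return math.prod(step_rates)


# Invented per-stage success rates for the four-step cascade.
cascade = {
    "speech recognition": 0.95,
    "machine translation": 0.90,
    "rendering the target text": 0.99,
    "speech generation": 0.97,
}

overall = pipeline_success(cascade.values())
weakest = min(cascade.values())
print(f"overall: {overall:.3f}, weakest single stage: {weakest:.2f}")
```

Under these assumptions the cascade as a whole performs worse than its weakest stage, which is the argument for collapsing the steps into a single multimodal model. Real stage errors are not fully independent, so treat this as a rough intuition, not a prediction.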

Conor Bronsdon 34:43 As we get much better at this multilingual automatic speech recognition, as we get better at the translation steps related to that, what will that enable when it comes to, as you put it, multilingual digital humans? What are we gonna see those leveraged for?

Olga Beregovaya 35:02 Here we are in a territory where I would want to tread very, very, very carefully, because we are in the territory of ethics, right? And obviously ethics and regulations around AI implementation. Because where it would be applicable is, I mean, obviously the universe of commercials, the universe of global communications, right? The universe of pretty much, I mean, the entertainment industry,

Olga Beregovaya 35:25 I think the avenues where this multimodal, multilingual AI can make an impact are almost obvious. A lot of places where you would traditionally require human talent can now just be delivered with the help of AI. In our industry, we have the most amazing examples and the most amazing companies that deliver voices, and not even synthetic voices, but potentially, of course by consent, voice cloning. So there are a lot of things that you can actually do that would traditionally require human talent. Again, I really want to tread very carefully there. The opportunities are there, but the fine balance between what is ethical AI

Olga Beregovaya 36:06 and what the impact can actually be, I think that fine balance is something that we need to establish, and need to establish sooner rather than later.

Conor Bronsdon 36:16 Completely agree, especially as there are a multitude of perspectives on what's allowable here. And, yeah, I think this is another one. You're giving me a lot of great full-length episode ideas here. Well, I'll have to have you back to talk more about ethics. I think there's a whole conversation here, and we can do a panel, and I think it's a conversation that needs to be had. Olga, thank you so much for this insightful conversation.

Conor Bronsdon 36:37 I've thoroughly enjoyed it, and it brings to mind a kind of capstone question for me, which is, where are we headed? Will we soon live in a world where translations are happening seamlessly, instantaneously, wherever we need them, and language is gonna be removed as a barrier to communication? What do you think the future looks like for translation enabled by AI over the next, call it, couple of years?

Olga Beregovaya 37:04 I think I need to be very honest here. And it's my opinion. You could not even imagine the scale and the passion of the debate that's taking place in the translation industry at the moment. But I think just looking at the metrics that we are able to capture as an industry and in my company, that trajectory towards human parity goes upwards like this. For some languages,

Olga Beregovaya 37:28 less, for some languages, more. I mean, obviously, and actually one thing that we need to be very mindful of, one thing that changed now: we really started thinking about languages in groups, in language families, and managing expectations around language families. They're not perceived as individual languages anymore. We really know, Hey, here is what I can do for my Romance languages. And probably

Olga Beregovaya 37:49 we're gonna hit human parity for those languages much faster. And then you take a highly agglutinative language, or you take, like, the Finno-Ugric group, which doesn't lend itself that well to any kind of natural language processing. And probably the path to human parity is gonna be slower and longer, but everything in research and implementation suggests that eventually we will get there.

Olga Beregovaya 38:14 And that "eventually" is around the corner. It can be three years, it can be five years, but I think the language barrier as we know it is going to be erased. And back to multimodality: not only as text, but obviously as a shared space for different modalities, be it spoken modality or video or text or anything. Now, the question is, coming from the translation industry, where's the role of a human in this? And I think

Olga Beregovaya 38:41 until biases and hallucinations have been completely solved for and until we have even language coverage, there will obviously be a role for linguists to help produce those datasets and even more importantly, fact check and validate. So I think it's a fine balance between, yes, we definitely are going to hit human parity at some point, both meaning wise and surface

Olga Beregovaya 39:06 representation-wise. But it's a question of what are we gonna do with it, and then, again, what are we gonna do as humans? What are we gonna do as language industry professionals? But are we getting there? I mean, yeah.

Conor Bronsdon 39:16 Olga, thank you so much for this amazing, wide-ranging conversation. Where can folks learn more about your work and follow you and the work you're doing at Smartling?

Olga Beregovaya 39:27 You know what? Actually, my biggest recommendation would be just, literally, follow Smartling on LinkedIn. My colleagues and myself, we publish a lot, we speak a lot, we host webinars. So literally the best way to follow it would be, I mean, first of all, feel free to add me on LinkedIn. And second, the best way is just literally follow Smartling on LinkedIn. We're very, very diligent about keeping the universe apprised of our progress on LinkedIn.

Conor Bronsdon 39:52 I love it. We will make sure to link Olga's LinkedIn and Smartling's in the show notes. Everyone, thank you so much for checking out this episode. If you are enjoying it, make sure you're subscribed on whatever platform you are listening to. It always helps us to have your ratings, have your reviews, have your comments. We'd love to hear from you. It gives us a lot of great feedback, a lot of great data we can then feed into how we think about the show, and it helps us bring more incredible guests like Olga onto the show. And if there is someone else that you think we should have on, please reach out to me directly. You can find me on LinkedIn at Conor Bronsdon.

Conor Bronsdon 40:25 That's it for this week. Olga, thank you again. I really enjoyed this conversation. Thanks so much for having me.