AI that solves math problems, translates 200 languages, and draws kangaroos – TechCrunch

Research in machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers, particularly in but not limited to artificial intelligence, and explain why they matter.

In this batch of recent research, Meta open sourced a language system that it claims is the first capable of translating between 200 different languages with state-of-the-art results. Not to be outdone, Google detailed a machine learning model, Minerva, that can solve quantitative reasoning problems, including math and science questions. And Microsoft released a language model, GODEL, for generating "realistic" conversations along the lines of Google's widely publicized LaMDA. And then we have some new text-to-image generators with a twist.

Meta’s new model, NLLB-200, is part of the company’s No Language Left Behind initiative to develop machine translation capabilities for most of the world’s languages. NLLB-200 is trained to understand languages such as Kamba (spoken by a Bantu ethnic group in Kenya) and Lao (the official language of Laos), as well as over 55 African languages that previous translation systems supported poorly or not at all. As Meta recently announced, the model is already being used to translate languages in the Facebook News Feed and on Instagram, as well as in the Wikimedia Foundation’s Content Translation tool.

AI translation has the potential to greatly scale, and already has scaled, the number of languages that can be translated without human expertise. But as some researchers have noted, AI-generated translations can contain errors such as incorrect terminology, omissions, and mistranslations, because the systems are trained largely on data collected from the internet, not all of which is high quality. For example, Google Translate once presumed that doctors were male while nurses were female, and Bing’s translator rendered phrases like “the table is soft” in German with the feminine “die Tabelle.”

For NLLB-200, Meta said it has “completely revamped” its data cleaning pipeline with “major filtering steps” and toxicity filter lists for the full set of 200 languages. How well it works in practice remains to be seen, but, as the Meta researchers behind NLLB-200 acknowledge in an academic paper describing their methods, no system is entirely free of bias.

Similarly, GODEL is a language model trained on a large amount of text from the internet. Unlike NLLB-200, however, GODEL was designed to handle “open” dialogue: conversations across a range of different topics.

Photo credit: Microsoft

GODEL can answer a question about a restaurant or hold a back-and-forth dialogue about a particular topic, such as the history of a neighborhood or a recent sports game. Usefully, and like Google’s LaMDA, the system can draw on content from around the web that was not part of its training data set, including restaurant reviews, Wikipedia articles, and other content on public websites.

But GODEL runs into the same pitfalls as NLLB-200. In a paper, the team responsible for its creation notes that it can generate harmful responses due to the “forms of social bias and other toxicity” in the data used to train it. Eliminating, or even mitigating, these biases remains an unsolved challenge in the field of AI, and one that may never be fully resolved.

Google’s Minerva model is potentially less problematic. As the team behind it describes in a blog post, the system learned from a 118 GB data set of scientific papers and web pages containing mathematical expressions to solve quantitative reasoning problems without using external tools like a calculator. Minerva can generate solutions that include numerical calculations and “symbolic manipulation,” achieving leading performance on popular STEM benchmarks.

Minerva is not the first model designed to solve these types of problems. To name a few, Alphabet’s DeepMind has demonstrated several algorithms that can help mathematicians with complex and abstract tasks, and OpenAI has experimented with a system trained to solve grade-school-level math problems. But Minerva incorporates recent techniques to better solve mathematical questions, the team says, including an approach in which the model is “prompted” with several step-by-step solutions to existing questions before being presented with a new one.
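The prompting approach described above is easy to picture in code. Here is a minimal, hypothetical sketch of how such a few-shot prompt might be assembled before being sent to a model; the example problems and function names are invented for illustration and are not from Google's actual Minerva pipeline.

```python
# Worked examples shown to the model before the new question.
# These two exemplars are made up for illustration.
EXEMPLARS = [
    {
        "question": "What is 12 * 4?",
        "solution": "12 * 4 = 12 * 2 * 2 = 24 * 2 = 48. The answer is 48.",
    },
    {
        "question": "If x + 3 = 10, what is x?",
        "solution": "Subtract 3 from both sides: x = 10 - 3 = 7. The answer is 7.",
    },
]

def build_few_shot_prompt(new_question, exemplars=EXEMPLARS):
    """Concatenate step-by-step worked examples ahead of the new question,
    ending with an open 'Solution:' for the model to complete."""
    parts = [
        f"Question: {ex['question']}\nSolution: {ex['solution']}"
        for ex in exemplars
    ]
    parts.append(f"Question: {new_question}\nSolution:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt("What is 7 * 6?")
```

The idea is simply that seeing solutions written out step by step nudges the model to reason the same way on the new question, rather than jumping straight to an answer.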


Photo credit: Google

Minerva still makes its fair share of mistakes, sometimes arriving at the correct final answer through flawed reasoning. Still, the team hopes it will serve as a foundation for models that “help push the frontiers of science and education.”

The question of what AI systems actually “know” is more philosophical than technical, but how they organize that knowledge is a fair and relevant one. For example, an object recognition system may show that it “understands” that housecats and tigers are similar in some ways by allowing the concepts to overlap purposefully when it identifies them, or maybe it doesn’t really understand and the two types of creatures are entirely unrelated to it.

Researchers at UCLA wanted to see whether language models “understood” words in that sense, and developed a method called “semantic projection,” which suggests that they do. While you can’t simply ask a model to explain how and why a whale is different from a fish, you can see how closely it associates those words with other words such as “mammal,” “large,” “scales,” and so on. If “whale” is strongly associated with “mammal” and “large” but not with “scales,” you know the model has a decent idea of what it’s talking about.
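To make the idea concrete, here is a toy sketch of semantic projection: word embeddings are projected onto an axis running between two pole words (say, “small” to “large”), and the scalar result tells you where each word falls on that spectrum. The four-dimensional vectors below are invented stand-ins, not real model embeddings, and the details of the UCLA method differ; this only illustrates the projection step.

```python
import numpy as np

# Made-up toy "embeddings" -- real models use hundreds of dimensions.
vectors = {
    "whale":  np.array([0.9, 0.8, 0.1, 0.7]),
    "mouse":  np.array([0.8, 0.1, 0.2, 0.1]),
    "small":  np.array([0.2, 0.05, 0.3, 0.1]),
    "large":  np.array([0.3, 0.9, 0.2, 0.8]),
}

def semantic_projection(word, low, high):
    """Project `word` onto the axis from `low` to `high`.

    Returns a scalar: larger values mean the word sits nearer
    the `high` end of the semantic axis.
    """
    axis = vectors[high] - vectors[low]
    return float(np.dot(vectors[word], axis) / np.linalg.norm(axis))

# With these toy vectors, a whale lands nearer the "large" end
# of the small-to-large axis than a mouse does.
assert semantic_projection("whale", "small", "large") > \
       semantic_projection("mouse", "small", "large")
```

The same projection works for any axis you can name with two pole words, which is what lets the researchers place animals on spectrums like size, danger, or wetness.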

An example of where animals fall along the model’s small-to-large spectrum.

As a simple example, they found that “animal” coincided with the concepts of size, gender, danger, and wetness (an odd choice), while “state” coincided with weather, wealth, and partisanship. Animals are nonpartisan and states are genderless, so that all tracks.

Currently there is no surer test of a model’s understanding of some words than asking it to draw them, and text-to-image models keep getting better. Google’s Pathways Autoregressive Text-to-Image model, or Parti, looks like one of the best yet, but it’s difficult to compare it with the competition (DALL-E et al.) without access, which few of the models offer. In any case, you can read more about the Parti approach here.

One interesting aspect of Google’s write-up shows how the model performs as the number of parameters increases. See how the image gradually improves as the numbers go up:

The prompt read: “A portrait photo of a kangaroo wearing an orange hoodie and blue sunglasses standing on the grass in front of the Sydney Opera House holding a sign on its chest that says Welcome Friends!”

Does this mean the best models will all have tens of billions of parameters, taking forever to train and running only on supercomputers? For now, sure: it’s something of a brute-force approach to making things better, but the “tick-tock” of AI means the next step isn’t just making models bigger and better, but making them smaller while keeping that quality. We’ll see who pulls it off.

Not one to be left out of the fun, Meta also showed off a generative AI model this week, one that it claims gives more agency to the artists who use it. Having played with these generators quite a bit myself, part of the fun is seeing what they come up with, but they frequently produce nonsensical layouts or don’t “understand” the prompt. Meta’s Make-A-Scene aims to fix that.

Animation of different generated images from the same text and sketch prompt.

It’s not an entirely original idea: you paint a basic silhouette of what you’re describing, and the model uses it as a foundation for generating an image. We saw something similar back in 2020 with Google’s nightmare generator. This is a similar concept, but scaled up to produce realistic images from text prompts, using the sketch as a base but with plenty of room for interpretation. It could be useful for artists who have a general idea of what they’re picturing but want to fold in the model’s boundless, weird creativity.

Like most of these systems, Make-A-Scene is not actually available for public use, since, like the others, it’s quite computationally intensive. Don’t worry, we’ll get decent versions of these things at home soon enough.

