If there’s one thing all the tech giants have in common today, it’s the race for the best Artificial Intelligence (AI) solutions. Microsoft just invested $1 billion into OpenAI, a San Francisco-based research lab founded by Silicon Valley influencers such as Elon Musk, intending to build the ultimate AI technology: Artificial General Intelligence (AGI).

AI research carried out at labs like OpenAI and DeepMind is at the forefront of the technology’s development. Keeping a close eye on the challenges tackled by these researchers offers a glimpse into the future trends of AI.

In this article, we cover three research projects that explore different areas of AI, generating some very promising results.

Ready to dive into the future? Let’s get started!

1. AlphaZero – mastering the games of chess, shogi and Go

Research lab: DeepMind

What does it do?

AlphaZero is a program designed to master the games of chess, shogi and Go. It achieved a superhuman level of play in all three games within 24 hours, defeating world-champion software (Stockfish and elmo). The same approach also powered AlphaGo, the first machine to defeat a professional human Go world champion.

Samples

Have a look here to see the chess games of AlphaZero. 

Technology

AlphaZero is a more generalized variant of the AlphaGo Zero algorithm: it can play not only Go but also chess and shogi. However, AlphaZero uses hard-coded rules for setting its search hyperparameters. Unlike chess, Go is symmetric under certain rotations and reflections – and AlphaGo Zero was built to exploit that symmetry (while AlphaZero wasn’t). And unlike Go, chess can end in a draw, so AlphaZero needs to take the possibility of a drawn game into account.

The AlphaZero neural networks were trained purely via self-play – using 5,000 custom tensor processing units (TPUs) to generate games and 64 second-generation TPUs to train the networks – with no access to endgame tables or opening books.

During training, AlphaZero was periodically matched against benchmark opponents – Stockfish, elmo, and AlphaGo Zero – so that DeepMind could track how the training was progressing. The researchers found that the algorithm surpassed each benchmark after roughly four hours of training for Stockfish, two hours for elmo, and eight hours for AlphaGo Zero.
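
At the heart of AlphaZero's self-play is a tree search that balances moves the network thinks are good against moves that haven't been explored much. As a rough illustration (not DeepMind's implementation), here is a minimal sketch of the PUCT selection rule such searches use, where `q` is each move's mean value so far, `p` the network's prior, and `n` the visit counts; the constant `c_puct` is an assumed example value:

```python
import math

def puct_select(q, p, n, c_puct=1.5):
    """Pick a move by the PUCT rule used in AlphaZero-style tree search.

    q: mean value of each move so far, p: network prior probabilities,
    n: visit counts. Rarely visited moves with high priors get an
    exploration bonus that shrinks as they accumulate visits.
    """
    total = sum(n)
    scores = [
        q[i] + c_puct * p[i] * math.sqrt(total) / (1 + n[i])
        for i in range(len(q))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Note how a move with zero visits but a decent prior can outscore a well-explored move with a higher value – that exploration pressure is what lets self-play keep discovering new lines.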

2. GPT-2 – taking Natural Language Processing (NLP) to the next level

Research lab: OpenAI

What does it do?

The successor to GPT, GPT-2 is a language model trained to predict the next word – it generates synthetic text samples in response to a specific input, offering unprecedented quality.

How did OpenAI researchers manage to achieve that? GPT-2 is a massive transformer-based language model with 1.5 billion parameters, trained on a vast dataset comprising 40GB of text scraped from the Internet. The model can flexibly adapt to the style and content of the conditioning text. As a result, users can generate coherent continuations of text about a given topic.
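To make the idea of "predicting the next word" concrete, here is a toy sketch of autoregressive text generation using a simple bigram model – far cruder than GPT-2's Transformer, but the generation loop (predict, sample, append, repeat) is the same shape:

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count word-pair frequencies: a crude stand-in for a language model."""
    words = text.split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def generate(model, prompt, length=10, seed=0):
    """Repeatedly predict the next word given the previous one."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(length):
        options = model.get(out[-1])
        if not options:
            break  # no continuation seen in training data
        words, counts = zip(*options.items())
        out.append(rng.choices(words, weights=counts)[0])
    return " ".join(out)
```

GPT-2 replaces the bigram table with a 1.5-billion-parameter network conditioned on the entire preceding context, which is what lets it stay coherent over whole paragraphs instead of word pairs.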

Samples

See a sample of text generated by GPT-2 at https://openai.com/blog/better-language-models/#sample2.

Technology

Experts agree that GPT-2 generates impressive results that mark a leap beyond the achievements of the existing language models. However, the technique involved here isn’t anything new. GPT-2 is a generative pre-trained Transformer, pre-trained using traditional language modeling. OpenAI researchers achieved their breakthrough by feeding the algorithm with more training data than ever. Recent advancements in the NLP field resulted from similar approaches.

To feed the GPT-2 algorithm with data, researchers created a new dataset that emphasized content diversity, scraping it from the Internet. To maintain document quality, they only used pages created or filtered by humans – specifically, the outbound links from Reddit that had received at least 3 karma (showing that other users found the content interesting). The idea behind this was to ensure higher data quality than similar datasets like CommonCrawl.
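The karma filter itself is a simple heuristic. As a sketch (the field names `url` and `karma` are assumptions for illustration, not the actual WebText pipeline), it amounts to:

```python
def filter_outbound_links(posts, min_karma=3):
    """Keep only URLs from posts that received at least `min_karma` karma,
    mirroring the human-curation heuristic behind GPT-2's training data."""
    return [p["url"] for p in posts if p["karma"] >= min_karma]
```

The appeal of the heuristic is that the filtering signal (upvotes) is produced by humans at no extra labeling cost, unlike manually curated corpora.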

GPT-2 was trained with one goal in mind: predicting the next word, given all of the previous words within a text. GPT-2 has ten times more parameters than its predecessor, GPT, and was trained on over 10x the amount of data. That’s why GPT-2 outperforms other language models trained on specific domains like Wikipedia without needing those domain-specific training datasets.

When it comes to tasks like answering questions, comprehending texts, translation, and summarization, GPT-2 learns them from raw text, without task-specific training data – suggesting that such tasks can benefit from unsupervised techniques when given sufficient unlabeled data and computing power.

Note: Due to safety concerns, researchers decided not to release the trained model. Instead, they released a much smaller model for experimentation and a technical paper.

As if in response to GPT-2, researchers from Harvard University and the MIT-IBM Watson AI Lab have recently built a new tool for identifying text generated by AI. The Giant Language Model Test Room (GLTR) takes advantage of the fact that such text generators rely on statistical patterns in text rather than word or sentence meaning. That’s how it can tell when the words seem too predictable to have been written by a human.
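GLTR's core signal can be sketched in a few lines: for each word, ask a language model for its ranked next-word predictions and record where the actual word falls in that ranking. Machine-generated text tends to use mostly top-ranked words; human text dips into unlikely ones. This is a simplified illustration of the idea, not GLTR's implementation – `predict_topk` stands in for a real model:

```python
def predictability_ranks(words, predict_topk):
    """For each word, find its rank among the model's predicted next words.

    predict_topk(prev) returns candidate next words ordered from most to
    least likely. Consistently low ranks suggest machine-generated text.
    """
    ranks = []
    for prev, actual in zip(words, words[1:]):
        predicted = predict_topk(prev)
        # Words the model didn't predict at all get a rank past the list end.
        ranks.append(predicted.index(actual) if actual in predicted else len(predicted))
    return ranks
```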

3. MuseNet – generating artificial music

Research lab: OpenAI

What does it do?

MuseNet is a deep neural network able to generate 4-minute musical compositions in a few seconds. The program combines ten different instruments with styles ranging from country and classical to pop and rock. In total, users can choose from 15 styles.

Samples

Here are some custom songs created by users of MuseNet:

Technology

MuseNet wasn’t explicitly programmed to match the human understanding of music. Instead, the program discovers patterns in rhythm, style, and harmony on its own. MuseNet accomplishes that by learning to predict the next token in a massive set of MIDI files.

MuseNet was trained using the same general-purpose unsupervised technology as GPT-2 – which also predicts the next token in a sequence. What GPT-2 does for text, MuseNet does for audio, creating some interesting musical pieces in the process.

The program generates each note by calculating the probabilities across all possible instruments and notes. That’s why the instruments users ask for act as strong suggestions rather than requirements: the model may pick a more likely instrument than the one the user designated. The tool also finds it challenging to pair unlikely styles and instruments – for example, matching Mozart with bass and drums.
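One way to picture "strong suggestion, not requirement" is as a soft bias on the model's output distribution: tokens for the requested instrument get their probability scaled up before sampling, but other tokens can still win. This is an illustrative sketch of that idea, not OpenAI's implementation – the token names and the `boost` factor are assumptions:

```python
def apply_instrument_hint(probs, hinted_tokens, boost=2.0):
    """Scale up the probability of tokens for the requested instrument,
    then renormalize. Other instruments remain possible, so the hint acts
    as a suggestion rather than a hard constraint."""
    boosted = {t: p * (boost if t in hinted_tokens else 1.0)
               for t, p in probs.items()}
    total = sum(boosted.values())
    return {t: p / total for t, p in boosted.items()}
```

A hard constraint would instead zero out every non-hinted token, which tends to produce less musical output when the requested instrument doesn't fit the style.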

Conclusion

The AI scene is developing at an increasing pace as both tech giants and brand-new startups drive innovation and experiment with new technologies. That’s why we should all pay attention to these efforts – they are the driving force shaping the future of this incredibly diverse field.

Would you like to unlock the value of AI for your business?

Get in touch with our consultants; we help companies make the most of innovative technologies.

Software Engineer

I am a big fan of AI and of applying machine learning methods to real-life problems, with experience in web development and databases. Currently, I'm involved in Big Data projects as well as in internal research at Codete.