
What is artificial intelligence? Everything you need to know about the technology behind ChatGPT and DALL-E

From ChatGPT to DALL-E and the new Bing, all of these are built on artificial intelligence. But what exactly is AI, and how do the new wonders of the technology world actually work?

Artificial intelligence, or AI, is everywhere these days. "Intractable" problems are being solved; people with no knowledge of coding, composing, or design can build websites, write songs, and create amazing artwork in seconds with the help of AI. Big companies are investing billions of dollars in AI projects, and by bringing the ChatGPT chatbot to Bing, Microsoft is trying to overturn the way we search the Internet, and perhaps, in time, disrupt the structure of the Internet itself.

Getting your head around artificial intelligence, like any new technology that arrives amid hype and media controversy, can be confusing; even AI experts can hardly keep up with the pace of its developments.

The field of artificial intelligence raises a series of questions. What exactly do we mean by artificial intelligence? What is the difference between artificial intelligence, machine learning, and deep learning? Which hard problems can now be solved easily, and which are still beyond AI's capabilities? And perhaps the most popular question of all: is the world going to be destroyed by artificial intelligence?

If you too have wondered why there is so much fuss and excitement about artificial intelligence, and would like the answers to these questions in plain language, join us for a look behind the curtain of this mysterious and powerful technology.

What is artificial intelligence?

The term "artificial intelligence," or AI, describes a system that can perform cognitive activities associated with the human mind, such as learning and problem solving, as well as or even better than humans. But in many cases, what we call artificial intelligence is really "automation," and to understand AI properly, we must first understand how it differs from automation.

There’s an old joke in the computer science world that automation is what we can do with computers right now, but artificial intelligence is what we wish we could do with computers. In other words, as soon as we understand how to do something with a computer, we leave the field of artificial intelligence and enter automation.

The reason for this joke is that artificial intelligence has no precise definition and is not even a technical term. If you look it up on Wikipedia, you'll read that AI is "intelligence demonstrated by machines, as opposed to the natural intelligence displayed by animals, including humans." In other words, a vague and broad definition.

In general, there are two types of artificial intelligence: strong AI and weak AI.

Strong artificial intelligence is what most people imagine when they hear the term AI: a kind of omniscient intelligence like HAL 9000, the murderous computer of 2001: A Space Odyssey, or Skynet, the self-aware AI system of the Terminator movies; systems with superhuman intelligence, the ability to reason and think logically, and capabilities far beyond those of humans.

What we have seen from artificial intelligence so far is weak artificial intelligence

In contrast, weak artificial intelligence consists of highly specialized algorithms designed to answer specific, useful questions, and limited to that one problem: the Google and Bing search engines, Netflix's movie recommendation algorithm, or voice assistants like Siri and Google Assistant. These models are very impressive in their own right, though limited in scope.

But Hollywood sci-fi aside, we are still a long way from achieving strong artificial intelligence. All the AIs we know today are weak, and some researchers believe that the methods used to develop weak AI will not carry over to strong AI. Of course, if you ask the employees of OpenAI, the developer of the popular ChatGPT chatbot, they will tell you that within the next 13 years, using the same known methods, they can achieve strong artificial intelligence!

To be precise, "artificial intelligence" is currently more a marketing and attention-grabbing term than a technical one. Companies use "artificial intelligence" instead of "automation" because they want to conjure up those Hollywood sci-fi images in our minds. But this is not entirely sly or deceitful; to put it charitably, these companies mean that although strong AI is still far off, today's weak AI should not be underestimated, because it has become many times more powerful than it was just a few years ago. And that is absolutely true.

In some fields, machine capabilities have changed dramatically, thanks to advances over the past few years in two areas of artificial intelligence: machine learning and deep learning. You have probably heard these two terms a lot, and we will explain how they work below. But first, let's talk a little about the fascinating history of artificial intelligence.

History of artificial intelligence

Can machines think?

In the first half of the 20th century, science fiction introduced people to the concept of intelligent robots, beginning with the Tin Man in the novel The Wonderful Wizard of Oz (1900). But it wasn't until the 1950s that a generation of scientists, mathematicians, and philosophers began seriously engaging with the concept of artificial intelligence. One of them was the English mathematician and computer scientist Alan Turing, who explored the possibility of achieving artificial intelligence through mathematics.

Turing reasoned that humans use available information along with the power of reasoning to make decisions and solve problems, so why couldn't machines do the same? This preoccupation eventually led to his famous 1950 paper, which opened with the controversial question, "Can machines think?" In it, Turing described how to build intelligent machines and test their intelligence, and by asking "Can machines excel at the imitation game?", he gave rise to the famous "Turing Test."

The lack of memory and the staggering cost of computers prevented Turing from testing his theory

But Turing's paper remained just a theory for several years, because the computers of the time lacked a key prerequisite for intelligence: they could not store commands, only execute them. In other words, computers could be told what to do, but they could not remember what they had done.

The second big problem was the staggering cost of working with computers. In the early 1950s, renting a computer cost as much as $200,000 per month; as a result, only prestigious universities and large technology companies could enter the field. In those days, anyone who wanted funding for artificial intelligence research first had to prove the feasibility of their idea and then win the support and approval of influential backers.

The historic DSRPAI conference that started it all

Five years later, three computer science researchers, Allen Newell, Cliff Shaw, and Herbert Simon, developed the Logic Theorist program, a proof of concept for Turing's idea of machine intelligence. Developed with funding from the RAND Corporation, the program was designed to mimic human problem-solving skills.

The term “artificial intelligence” was coined by John McCarthy in 1956

Logic Theorist is considered by many to be the first artificial intelligence program. It was presented at the Dartmouth College Summer Research Project on Artificial Intelligence (DSRPAI) hosted by John McCarthy and Marvin Minsky in 1956.


At this historic conference, McCarthy brought together top researchers from various fields for an open discussion of artificial intelligence (a term McCarthy himself coined at that very event), in the hope that AI could be achieved through collective collaboration. But the conference fell short of McCarthy's expectations: there was no coordination among the researchers, who came and went as they pleased and never agreed on standard methods for conducting AI research. Even so, everyone present felt wholeheartedly that AI was achievable.

The importance of the DSRPAI conference can hardly be overstated; the next 20 years of artificial intelligence research were built on it.

The rollercoaster of successes and failures of artificial intelligence

The period from 1957 to 1974 is known as the heyday of artificial intelligence. Computers became faster, cheaper, and more ubiquitous, and could store more information. Machine learning algorithms also improved, and people got better at knowing which algorithm to apply to which problem.

Early programs such as Newell and Simon's General Problem Solver and ELIZA, the software designed by Joseph Weizenbaum in 1966 that became the first chatbot said to pass the Turing test, brought scientists a few steps closer to the goals of "problem solving" and "interpretation of spoken language," respectively.

At this time, researchers were very optimistic about the future of artificial intelligence

These successes, along with the advocacy of prominent researchers who had attended DSRPAI, eventually convinced government agencies such as the US Defense Advanced Research Projects Agency (DARPA) to fund AI research at several institutions. The US government was particularly interested in machines that could transcribe and translate spoken language and process data at high throughput.

At this time, researchers were very optimistic about the future of the field, and their expectations ran even higher than their optimism; as Marvin Minsky told Life magazine in 1970, "In three to eight years, we will have a machine with the general intelligence of an average human being." Yet even though the case for AI's feasibility had been made, there was still a long way to go before the ultimate goals of natural language processing, abstract thinking, and self-awareness in machines could be achieved.

There were many obstacles in the way, the biggest being the lack of computing power: the computers of that era had neither the space to store huge amounts of information nor the speed to process them. Hans Moravec, then a PhD student of McCarthy's, said that "computers were still millions of times too weak to exhibit intelligence." As researchers' patience ran out, government funding dried up too, and for ten years artificial intelligence research slowed to a crawl.

Then, in the 1980s, two factors revived artificial intelligence research: significant improvements in algorithms and an influx of new funding.

Significant improvements in algorithms have given new life to artificial intelligence research

John Hopfield and David Rumelhart developed "deep learning" techniques that allowed computers to learn from their own experience. Meanwhile, the American computer scientist Edward Feigenbaum introduced "expert systems," which mimic the decision-making process of human experts. An expert system asks specialists in a field how they would respond in a specific situation, and then makes their collected answers available to non-experts, who can learn from the program.

Expert systems were widely used in industry. The Japanese government invested heavily in expert systems and other artificial intelligence projects as part of its Fifth Generation Computer Project (FGCP). From 1982 to 1990, Japan spent $400 million on revolutionizing computer processing, implementing logic programming, and improving artificial intelligence.

Unfortunately, most of these ambitious goals were never met, though it can be argued that the FGCP indirectly inspired a generation of young engineers and scientists to enter the world of artificial intelligence. Eventually the FGCP's funding ran out, and AI fell out of the spotlight once again.

The world chess champion's defeat by Deep Blue: the first big step toward decision-making AI

Ironically, AI found new opportunities to grow in the absence of government funding and public hype. During the 1990s and 2000s, many of artificial intelligence's landmark goals were achieved. In 1997, IBM's chess-playing supercomputer Deep Blue defeated the grandmaster and world chess champion Garry Kasparov. The match drew enormous media attention: it was the first time in history a world chess champion lost to a computer, and it is remembered as the first big step toward artificial intelligence capable of decision making.

That same year, Dragon Systems' speech recognition software was implemented on Windows, another big step forward, this time toward interpreting spoken language. It began to seem as if there was no problem machines couldn't tackle. Even human emotions came within reach: Kismet, a robot created in the 1990s by Cynthia Breazeal at MIT, could recognize and even display emotions.

Time: the balm for all wounds

Scientists still use largely the same methods to program artificial intelligence as they did decades ago; so what changed to bring us achievements as impressive as the ChatGPT chatbot and the DALL-E and Midjourney image generators?

The answer is that engineers finally overcame the problem of computer storage limitations. Moore's Law, which estimates that the memory and speed of computers double every year, finally caught up with, and in many cases surpassed, our needs. In fact, Garry Kasparov's defeat in 1997, like Go champion Ke Jie's defeat by Google's AlphaGo program in 2017, came down to exactly this growth in computer speed and memory. It also explains the rhythm of AI research: we push AI capabilities to the limit of current computing power (processing speed and storage), then wait for Moore's Law to catch up again.

The reason humans lost to artificial intelligence: growth in computer speed and memory

We now live in the age of "big data," an era in which we can collect amounts of information far too vast for humans to process. The use of artificial intelligence in industries such as technology, banking, marketing, and entertainment has gone a long way toward solving this problem. The large language models behind the ChatGPT chatbot showed us that even if the algorithms are not terribly advanced, big data and massive computing can help AI learn and improve its performance.

There is some evidence that Moore's Law is slowing down, especially in the world of chips, but the growth of data continues at breakneck speed, and advances in computer science, mathematics, or neuroscience could all push humanity past the limits of Moore's Law. That means human progress in artificial intelligence will not be ending any time soon.

Types of artificial intelligence

Artificial intelligence is categorized in different ways. Apart from the very general division into weak and strong AI that we discussed at the beginning of the article, another common scheme divides artificial intelligence into four categories:

1) Reactive machines, the simplest type of artificial intelligence, which can only respond to the current situation without drawing on past experience; the Google search engine is one example.

2) Limited memory machines, which can use some past data to improve their decision making; for example, authentication systems on websites.

3) Theory of mind, a currently hypothetical type of artificial intelligence that could understand the feelings, emotions, and beliefs of humans and then use that information to make its own decisions.

4) Self-aware artificial intelligence, another hypothetical type that has achieved self-awareness and can have feelings and thoughts similar to a human's.

But the most practical classification of artificial intelligence, which has nothing to do with hypotheses and theories and simply describes what has actually been achieved so far, is the division into "machine learning" and "deep learning," which between them underpin almost all of today's intelligent systems.

If you have long wondered what exactly these two terms mean but never found a clear answer, don't worry; we will try to explain these two complicated topics as simply as possible.

Machine Learning

Machine learning is one particular way of creating artificial intelligence. Suppose we want to launch a missile and predict where it will land. This is not especially difficult: gravity is well understood, and we can write the relevant equations and calculate, from a few variables such as speed and position, where our hypothetical missile will come down.
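To see how tractable the "known physics" case is, here is a minimal Python sketch (the 300 m/s launch speed and 45-degree angle are made-up values) that predicts a projectile's landing distance from the standard range formula, ignoring air resistance:

```python
import math

def landing_distance(speed_m_s: float, angle_deg: float, g: float = 9.81) -> float:
    """Range of a ground-launched projectile with no air resistance:
    range = v^2 * sin(2 * theta) / g."""
    theta = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2 * theta) / g

# Hypothetical inputs: a 300 m/s launch at 45 degrees lands about 9.2 km away.
print(round(landing_distance(300.0, 45.0)))  # -> 9174
```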

But when unknown variables enter the picture, finding the answer is no longer so easy. This time, suppose we want the computer to look at a set of images and tell us whether any of them contains a cat. What kind of equation could we possibly write to describe to the computer every combination of whiskers and cat ears from every angle?

This is where machine learning comes to the rescue: instead of writing the formulas and rules ourselves, we build a system that can write the rules for itself by looking at sample photos. In other words, instead of trying to describe a cat, we show the AI lots of pictures of cats and let it figure out for itself what is a cat and what is not.

Machine learning is perfect for our data-saturated world, because a system that learns its own rules from data can improve with more data. Want your system to get better at recognizing cats? Well, the Internet is generating millions of cat images right now!

One reason machine learning has become so popular in recent years is the dramatic increase in the amount of data on the Internet; another is how that data is used. Beyond the data itself, machine learning raises two related questions:

1) How do I remember what I learned? How can I store, on a computer, the rules and relationships extracted from the sample data?

2) How do I learn? How can I revise the rules and relationships I stored for previous examples in response to new examples, and improve?

In other words, what exactly is it learning from all this data?

In machine learning, choosing the type of model is very important

In machine learning, the computer's representation of what it has learned and stored is called a "model." Which model you use matters a great deal, because it determines how the AI learns, what kinds of data it can learn from, and what kinds of questions you can ask of it.

Let's clarify this with a simple example. Suppose we go to a fruit shop to buy figs and want to use machine learning to figure out which figs are ripe. This should be an easy task, because we know that the softer a fig is, the riper and sweeter it will be. We can take several sample figs at varying stages of ripeness, measure their sweetness, and plot the results, then draw a line through the points. That line is our "model." Notice that this simple line captures the idea "the softer, the sweeter" without our having to write any rule by hand. Our nascent AI knows nothing about sugar content or how fruit ripens, but it can predict how sweet a fig is just by a squeeze that measures its softness.
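As a rough sketch of that line model, with entirely made-up softness and sweetness numbers, a least-squares fit in Python captures "the softer, the sweeter" as just two learned numbers, a slope and an intercept:

```python
import numpy as np

# Hypothetical measurements: softness (0 = rock hard, 10 = very soft) vs. sweetness.
softness = np.array([1.0, 2.5, 4.0, 5.5, 7.0, 8.5])
sweetness = np.array([2.1, 3.0, 4.2, 5.8, 7.1, 8.4])

# Fit a straight line to the samples; this line IS our "model".
slope, intercept = np.polyfit(softness, sweetness, deg=1)

def predict_sweetness(s: float) -> float:
    """The model's guess for a fig we squeeze but have not tasted."""
    return slope * s + intercept

print(predict_sweetness(6.0))
```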

The main challenge of machine learning is to create and choose the right model to solve the problem

Deep Learning

Deep learning is a type of machine learning that uses a special type of model called “Deep Neural Networks”.

Neural networks are a type of machine learning model that performs calculations and predictions using a structure inspired by the neurons of the human brain. The neurons in a neural network are organized into layers; each layer performs a series of simple calculations and passes its output to the next layer. The more layers there are, the more complex the calculations that can be performed.
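A toy sketch of that layered structure, with random (untrained) weights and arbitrary layer sizes, might look like this in Python; each layer is just a small calculation whose output feeds the next layer:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)  # a simple per-neuron calculation

def forward(x, layers):
    """Each layer computes weights @ x + bias, applies ReLU,
    and hands its output to the next layer."""
    for weights, bias in layers:
        x = relu(weights @ x + bias)
    return x

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 2]  # 4 inputs -> two hidden layers of 8 -> 2 outputs
layers = [(rng.normal(size=(m, n)), np.zeros(m)) for n, m in zip(sizes, sizes[1:])]
print(forward(rng.normal(size=4), layers))  # untrained, so the output is meaningless for now
```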

Deep neural networks are called “deep” because of the large number of neuronal layers

In the fig example, a simple network with a few layers of neurons is enough to predict the answer. But deep neural networks have tens or even hundreds of layers, which is precisely why they are called deep. With that many layers, you can build enormously powerful models that learn all kinds of complex concepts on their own, without human-written rules, and solve problems that computers previously could not.

But apart from the number of layers, there is another factor behind the success of neural networks: training.

When we talk about a model's "memory," we mean the set of numerical parameters that govern how the model answers questions. And when we talk about training a model, we mean changing and adjusting those parameters so that the model gives the best possible answers to our questions.

With the fig model, for example, we needed the equation of a line, which is a simple regression problem; there are formulas that find the answer in a single step. More complex models naturally need more steps. A deep neural network can have millions of parameters, and the dataset it is trained on may contain millions of examples; for such a model, there is no one-step solution.

It is possible to start with an incomplete neural network and improve it further

Fortunately, there is a neat trick for this challenge: you can start with a weak, incomplete neural network and then improve it through repeated small changes. Training a machine learning model this way is like regularly testing a student. Each time, we compare the answer the model thinks is correct with the answer that really is correct and give it a score. Then we try to improve the model and test it again.

To improve the neural network, a "hill climbing" method is used
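Here is a minimal sketch of that test-score-improve loop, applied literally as hill climbing to the fig model from earlier (the data points are the same made-up values): start with a weak model, try a small random change, and keep it only if the score improves.

```python
import numpy as np

rng = np.random.default_rng(0)
softness = np.array([1.0, 2.5, 4.0, 5.5, 7.0, 8.5])
sweetness = np.array([2.1, 3.0, 4.2, 5.8, 7.1, 8.4])

def score(params):
    """Grade the model's 'exam': higher is better (negative mean squared error)."""
    slope, intercept = params
    return -np.mean((slope * softness + intercept - sweetness) ** 2)

params = np.zeros(2)  # start with a weak, untrained model
for _ in range(5000):
    candidate = params + rng.normal(scale=0.05, size=2)  # a small random change
    if score(candidate) > score(params):                 # keep it only if the score improves
        params = candidate

print(params)  # should creep toward the least-squares slope and intercept
```

In practice, real networks are trained with gradient descent, which computes which direction to move in rather than guessing randomly, but the keep-what-scores-better idea is the same.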

This method makes neural networks easier to improve. If our network has a good structure, we don't need to start from scratch every time new data arrives: we can start from the existing parameters and keep training the model on the new data. Some of today's most prominent AI systems, from Facebook's cat-image recognition tools to the checkout-free technology in Amazon Go stores, are built on this simple technique.

In addition, with the help of the "hill climbing" method, a neural network trained for one purpose can be repurposed for another. For example, if you have trained your AI to recognize images of cats, you can retrain it relatively easily to recognize dogs or giraffes.
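A hedged PyTorch sketch of that retraining idea: load a network pretrained on a large image dataset, freeze its existing parameters, and replace only the final layer for the new two-class task (dog vs. giraffe here is just an illustrative choice).

```python
import torch
import torchvision

# Start from a network that already knows general image features (ImageNet weights).
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False  # keep the existing parameters as-is

# Swap in a fresh output layer for the new task, e.g. dog (0) vs. giraffe (1).
model.fc = torch.nn.Linear(model.fc.in_features, 2)

# Only the new head is trained, on the new dataset (training loop not shown).
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
```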

The flexibility of neural networks, massive amounts of Internet data, parallel computing, and powerful GPUs have made the dream of artificial intelligence a reality.

It is this flexibility of neural networks that has driven artificial intelligence's great progress over the last seven or eight years. The Internet constantly produces huge amounts of data, and parallel computing on powerful graphics processors has made it possible to work with data at that scale. Finally, deep neural networks let us use those datasets to build very complex and powerful machine learning models.

Thus, all the things that were almost impossible to do in Alan Turing’s time are now easily possible.

Application of artificial intelligence

Now that we are familiar with the types of artificial intelligence and how they work, the next question is: what can we actually do with them? The applications of artificial intelligence generally fall into four areas: object recognition, face recognition, speech recognition, and generative networks.

Object Recognition

It is fair to say that the field where deep learning has had its biggest and fastest impact is computer vision, especially recognizing objects in images. Just a few years ago, the state of the art in object recognition was so dismal that it was neatly captured in the cartoon below.


Man: I want the app to check, whenever a user takes a photo, whether the photo was taken inside a national park, for example…

Woman: Sure, that's a simple GIS lookup. It won't take more than a few hours.

Man: …and whether there is a bird in the photo.

Woman: Well, for that I'll need a research team and five years.

Today, identifying a bird, even a specific species of bird, in a photo is so easy that a high school student can do it. So what happened in those few years?

The idea behind machine recognition is easy to describe but hard to implement. Complex objects are built from collections of simpler objects, which in turn are built from simpler shapes and lines. A face, for instance, is made up of eyes, a nose, and a mouth, which are themselves made of circles, lines, and so on. So to recognize a face, you need to recognize the patterns of its components.

Every complex object is made of a set of simpler objects and patterns; Algorithms look for these patterns

These patterns are called features, and before the advent of deep learning, they had to be crafted by hand before computers could be trained to find them. For example, a famous face detection algorithm called "Viola-Jones" relies on the fact that the brow and nose are usually brighter than the eye sockets; the resulting brow-and-nose pattern looks like a bright T shape with two dark spots for the eyes. The algorithm scans images for this pattern to detect faces.

The Viola-Jones algorithm works well and fast, and the face detection in cheap cameras is based on it. But obviously not every face follows such a simple pattern. Teams of leading researchers spent years refining machine vision algorithms, yet the results remained weak and bug-ridden.
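You can still try a Viola-Jones-style detector today: OpenCV ships Haar-cascade models trained in exactly this pattern-matching spirit. A minimal sketch (the file name photo.jpg is a placeholder for any local photo):

```python
import cv2

# A Haar-cascade face model in the Viola-Jones tradition, bundled with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

image = cv2.imread("photo.jpg")  # placeholder path
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:  # draw one rectangle per detected face
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.jpg", image)
```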

Then machine learning arrived, and in particular a type of deep neural network called the "Convolutional Neural Network" (CNN), which revolutionized object detection algorithms.

Convolutional neural networks, or CNNs, have a special structure inspired by the visual cortex of the mammalian brain. This structure lets a CNN recognize objects in images by learning sets of lines and patterns on its own, instead of requiring teams of researchers to spend years finding the right patterns by hand.

CNNs proved fantastic for machine vision, and researchers soon trained them for all kinds of visual recognition tasks, from spotting cats in pictures to spotting pedestrians through the cameras of self-driving cars.
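For a sense of what that structure looks like in code, here is a minimal (untrained) CNN sketch in PyTorch: convolution layers learn small line and texture patterns, pooling shrinks the image, and a final linear layer turns the detected patterns into class scores. The sizes are arbitrary choices for 32 x 32 input images.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # for 32x32 inputs

    def forward(self, x):
        x = self.features(x)                  # learned patterns: (batch, 32, 8, 8)
        return self.classifier(x.flatten(1))  # class scores, e.g. cat vs. not-cat

print(TinyCNN()(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 2])
```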

Moreover, CNNs adapt to new datasets with so little effort that they were adopted quickly and widely. Remember the hill climbing process? If our high school student wants an algorithm that recognizes a particular species of bird, they can simply pick one of the many freely available open-source machine vision networks and train it on their own dataset, without ever having to work out the math and formulas behind the network.

Face Recognition

Suppose we want to train a network that can not only detect faces in general (that is, say "there is a human in this photo") but can also recognize exactly whose face it is.

To do this, we take a network previously trained for general face detection, then change its output: instead of asking it to find a face in a crowd, we ask it to give us a description of that face as a few hundred numbers that might encode the shape of the nose or the eyes. Since the network already knows what the components of a face are, it can do this.

Of course, we don't specify those numbers directly; rather, we train the network by showing it sets of faces and comparing the outputs, teaching it to produce very similar descriptions for faces of the same person and dissimilar descriptions for faces of different people.

Now face recognition becomes easy. First, we give the network the first face image and get its description. Next, we feed it the second face and compare the two descriptions. If they are close, we say the two faces belong to the same person. In this way, we went from a network that could only detect that there is a face to a network that can recognize any face!
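A sketch of that final comparison step, using random vectors as stand-ins for the network's few-hundred-number face descriptions (the 0.8 threshold is an arbitrary illustrative value; in practice it is tuned by testing):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for the embedding network's outputs.
rng = np.random.default_rng(0)
face_a = rng.normal(size=256)
face_b = face_a + rng.normal(scale=0.1, size=256)  # a very similar description
face_c = rng.normal(size=256)                      # an unrelated description

THRESHOLD = 0.8  # "close enough to call it the same person"
print(cosine_similarity(face_a, face_b) > THRESHOLD)  # True: same face
print(cosine_similarity(face_a, face_c) > THRESHOLD)  # False: different faces
```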

Deep neural networks are incredibly flexible

Deep neural networks are extremely useful precisely because of this flexible structure. Many kinds of machine learning models for computer vision have been built on this technology, and although their applications differ, many of their core structures descend from early CNNs such as AlexNet and ResNet.

Interestingly, some people have even used face recognition networks to read the lines of time-series charts! Instead of building a custom network for data analysis, they retrain the face recognition network to look at the shapes of chart lines the way it looks at human faces and describe the patterns it finds.

This flexibility is great, but eventually it runs out. That is why some problems require another type of network, which we will get to shortly.

Speech Recognition

Speech recognition is in some ways similar to face recognition: the system learns to treat something complex as a set of simpler features. In speech, recognizing sentences and phrases follows from recognizing words, which in turn follows from recognizing syllables, or more precisely phonemes. So when we hear someone say "Bond, James Bond," we are really hearing a sequence of sounds something like BON+DUH+JAY+MMS+BON+DUH.

In machine vision, features are organized spatially, and a CNN's structure is built to find those spatial arrangements. In speech recognition, however, features are organized temporally. People may speak slowly or quickly, and we can't know in advance where their speech begins or ends. We want a model that, like a human, can listen to sounds the moment they are made and recognize them, rather than waiting for the sentence to finish. Unfortunately, unlike in physics, we cannot simply declare that space and time are the same and call it a day.

If you have used your phone's voice assistant, Siri or Google Assistant has probably misheard you many times because of similar-sounding syllables. You say "what's the weather," and it thinks you asked "what's better." To solve this problem, we need a model that can weigh the sequence of syllables in context, and this is where machine learning comes in again. Given a large enough collection of spoken language, a model can learn which utterances are most likely, and the more examples it sees, the better its predictions become.
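The simplest version of "learn which utterances are most likely" is just counting. A toy sketch with a made-up corpus: after counting which word follows which, "the" beats "better" as the continuation of "what's".

```python
from collections import Counter, defaultdict

# Toy corpus of things people actually said (made up for illustration).
corpus = "what's the weather what's the weather what's the time what's better".split()

follows = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    follows[w1][w2] += 1  # count each observed word pair

# Having heard "what's", which next word is most likely?
print(follows["what's"].most_common())  # [('the', 3), ('better', 1)]
```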

This is the job of "Recurrent Neural Networks" (RNNs), whose built-in memory allows them not only to "listen" to individual syllables as they are spoken, but also to learn which syllables tend to go together to form words, and to predict which phrases and sentences are more likely. As a result, an RNN can teach the voice assistant that "what's the weather" is far more probable than "what's better," and respond accordingly.

With the help of RNNs, human speech can be recognized and converted to text remarkably well; these networks have improved so much that their accuracy now rivals or exceeds that of humans. And sequences don't appear only in audio: today, RNNs are also used to recognize sequences of movements in video.
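A minimal PyTorch sketch of such a sequence model (the sizes and the GRU variant are arbitrary choices): at each step it takes in one token, updates its hidden memory, and outputs scores for what comes next.

```python
import torch
import torch.nn as nn

class TinyRNN(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # the recurrent "memory"
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):               # tokens: (batch, seq_len) word ids
        h, _ = self.rnn(self.embed(tokens))  # memory is carried across the sequence
        return self.out(h)                   # next-token scores at every position

model = TinyRNN(vocab_size=1000)
scores = model(torch.randint(0, 1000, (1, 5)))  # five tokens in
print(scores.shape)                             # torch.Size([1, 5, 1000])
```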

Deepfakes and Generative AI

So far we have only discussed machine learning models used for recognition: we wanted the model to tell us what it sees in a picture, or to understand what was said. But these models can do more. As you have probably gathered from working with chatbots and the DALL-E platform, deep learning models can now be used for creating content as well!

You have surely heard of deepfakes: fake videos in which celebrities appear to say or do things that look real but never happened. A deepfake is another application of deep learning, one that takes audio and video content and alters it at will, so that the final result is something entirely different from the original.


Consider the well-known deepfake dance videos: the model analyzes a video of one person dancing, finds the patterns in their movements, and then maps the same rhythmic moves onto another person in a second video, so that the person in the second video dances exactly like the first.

With the techniques described so far, it is quite possible to train a network that takes an image of a dancing person and reports where their hands and feet are; along the way, it learns how the pixels of an image relate to body positions. And since, unlike a real brain, an artificial neural network is just data stored in a computer, nothing stops us from running the process in reverse: asking the model to produce the pixels from the positions of the hands and feet.

Machine learning models that can create deepfakes, or that, like DALL-E and Midjourney, turn descriptive text into images, are called generative models. Every model we discussed before this was a discriminative model: it looks at a set of images and determines which contain a cat and which do not. A generative model, as the name suggests, does the opposite: it can generate an image of a cat from a textual description of one.

Generative models built to "imagine" objects use the same CNN structures found in object detection models and can be trained in much the same way as other machine learning models.

The challenge of building a generative model is to define a scoring system for it

But the challenging part of training generative models is defining a scoring system for them. Discriminative models are trained with right and wrong answers; if one labels a picture of a dog as a cat, we can tell it the answer is incorrect. But how do you score a model that drew a picture of a cat? How good is the drawing, and how close is it to reality?

This is where the story gets genuinely scary for technology pessimists, the people convinced the world will be destroyed by killer robots. Because the best method we currently have for training generative networks is to let another neural network do the training instead of us: two artificial intelligences pitted against each other!

For people who believe in the future of killer robots, GANs make the story scary

This technique is called "Generative Adversarial Networks," or GANs. In this method, two neural networks work against each other: on one side, a network tries to produce a convincing fake (for example, taking the positions of a dancer's hands and feet and mapping them onto another person), and on the other side, a second network, trained on a set of real dance samples, tries to tell real video from fake.

In training, the two networks face off in a kind of competitive game, which is where the word "adversarial" comes from: the generator network tries to create convincing fakes, and the discriminator network tries to tell what is real from what is fake.

With each round of training, both models get better. It's like pitting a jewelry forger against a seasoned appraiser, each trying to outdo the other by getting smarter. Finally, when both have improved enough, the generative model can be used on its own.
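Here is a deliberately tiny GAN sketch on one-dimensional "data" (numbers clustered near 3, a made-up stand-in for real images): the generator forges samples from noise, the discriminator scores real versus fake, and each round trains one against the other.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # generator: noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # discriminator: sample -> realness score
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0  # "real" data: numbers near 3
    fake = G(torch.randn(64, 8))

    # Discriminator round: learn to label real as 1 and fake as 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator round: try to make the discriminator call the fakes real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(5, 8)).detach().squeeze())  # samples should drift toward 3
```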

Generative models are excellent at producing content, whether image, audio, text, or video. ChatGPT, the chatbot making so much noise these days, uses a large generative language model and can respond to almost any user request, from writing poems and screenplays to producing articles and code, within seconds, and in prose that can be hard to distinguish from human writing.

GANs frighten some people (the very skeptical and pessimistic ones, of course!) because the human role in training is reduced to that of an observer, with almost the entire learning and training process handled by artificial intelligence itself.

Examples of artificial intelligence

These days, artificial intelligence shows up almost everywhere: voice assistants like Siri and Alexa, the movie and song recommendation algorithms of Netflix and Spotify, self-driving cars, robots on production lines. But a few recent releases in particular have gotten everyone talking about the field, and we will briefly introduce them below.

ChatGPT

ChatGPT is an experimental chatbot, or rather the most capable chatbot ever made available to the public. Released in November 2022 by OpenAI, it is based on version 3.5 of the GPT language model.


Much has been said about the wonders of ChatGPT. By typing requests into its very simple interface, users get remarkable results: poems, songs, screenplays, articles, code, and answers to almost any question you can think of, all in less than ten seconds.

The data ChatGPT was trained on is so vast that it would take "a thousand years of human life" to read it all. The knowledge packed into the system covers an enormous slice of the world we live in, which is why it can answer almost anything we ask.

DALL-E

The DALL-E image generation platform, its name a blend of the surrealist painter Salvador Dali and Pixar's WALL-E, is one of OpenAI's most interesting products: users' text prompts are turned into striking works of art in seconds.


The first version of DALL-E was built on the GPT-3 model and was limited to images of 256 x 256 pixels. The second version, which entered private beta in April 2022, is considered a big leap for AI image generators: DALL-E 2 produces 1024 x 1024 pixel images and supports new techniques such as "inpainting," in which parts of an image are replaced with newly generated content of the user's choosing.

The magic of DALL-E and generators like it lies not just in recognizing individual objects but in an extraordinary grasp of the relationships between them; when you ask for "an astronaut on horseback," it knows exactly what you mean.

Currently, people who have access to ChatGPT can also use the Dall-E platform.

Copilot

Microsoft, which acquired GitHub in 2018 and later secured a license to GPT-3, partnered with OpenAI to develop the Copilot AI tool through the GitHub platform. Copilot runs inside the code editor and helps developers write code.

Copilot is free for verified students and maintainers of open source projects, and according to GitHub, nearly 40% of the code in files where Copilot is enabled is written by the tool. Copilot is built on OpenAI's Codex model, a descendant of the flagship GPT-3.

Jukebox

The Jukebox system is truly amazing: give this bot a genre, an artist's name, and some lyrics, and it generates a brand-new sample song from scratch. On OpenAI's SoundCloud profile you can listen to samples of Jukebox's AI-generated songs; according to the company, the lyrics were written by a language model together with a number of researchers.

Besides Jukebox, Google's new artificial intelligence tool MusicLM can also generate songs from text descriptions, though it is not yet available to the public.

According to Google, MusicLM was trained on a total of 280,000 hours of music to learn to produce coherent, complex songs from the descriptions it receives. Given a prompt like "a jazz song with a saxophone solo and a solo singer" or "a 90s techno song with low bass and powerful beats," it creates very high-quality songs, with output impressively close to music made by human artists.

Midjourney

Like DALL-E, Midjourney is an interactive bot that uses machine learning to create images from text. The platform runs on Discord, and its free tier allows users a limited number of requests. Other users' requests, and the images Midjourney produces for them, are visible in the platform's Discord channel.


One of Midjourney's charms is producing several variations of the same image; stitched together, these variations can make an attractive stop-motion-style animation. Some argue the images Midjourney produces show more quality and creativity than DALL-E's.

New Bing

"New Bing" is Microsoft's familiar, and famously unloved, search engine, now equipped with a very powerful artificial intelligence model. It is a fresh attempt to end Google's years-long dominance of search and, as Microsoft hopes, to completely transform the way we search the Internet.


If ChatGPT's capabilities surprised you, the version used in Bing will likely surprise you more: Microsoft says the language model behind Bing is GPT-4, reportedly equipped with 700 billion parameters. In addition, the Bing chatbot is connected to the Internet, so its information is always up to date.

In the new Bing, you can ask your question in natural language and the artificial intelligence will start answering in the same natural language. Microsoft says that this model of responding to user requests is more practical and useful than traditional search.

LaMDA

Like ChatGPT, LaMDA is a machine learning chatbot designed to converse on any kind of topic. Its name stands for Language Model for Dialogue Applications, and it is based on the Transformer neural network architecture that Google designed in 2017, the same architecture used to build ChatGPT.

Google has so far declined to release LaMDA publicly, but last year the chatbot made headlines after a Google employee claimed it had become self-aware. In a controversial claim that led to his firing, the employee said LaMDA has feelings and mental experiences, and is therefore self-aware.

The claim that LaMDA is self-aware has been strongly denied by both Google and artificial intelligence experts. Honestly, artificial intelligence technology is still far from self-aware systems; many experts put that distance at 50 years.

PaLM

PaLM stands for Pathways Language Model, another language model from Google, and a far more sophisticated one than LaMDA.

Google unveiled PaLM at I/O 2022, alongside LaMDA 2, which had just been made available to developers. PaLM can handle things LaMDA cannot: solving math problems, coding, translating C code into Python, summarizing text, and explaining jokes. What surprised even its developers was that PaLM can reason, or more precisely, can carry out a reasoning process.

PaLM has 540 billion parameters, about four times more than LaMDA and three times more than the GPT-3 model behind ChatGPT. Thanks to this enormous parameter count, PaLM can perform hundreds of different tasks without task-specific training, and some are even tempted to call it the closest human achievement yet to "strong artificial intelligence," since it can take on almost any thinking task a human can without special training.

The dangers of artificial intelligence

Artificial intelligence is like the gray characters in stories: neither 100% villain nor 100% superhero savior. While it makes human life simpler and brings complex, expensive technologies within reach, it also carries risks and challenges, some of which are listed below:

The loss of some jobs to automation. Since 2000, artificial intelligence and automation systems have eliminated 1.7 million manufacturing jobs, and according to the World Economic Forum's 2020 Future of Jobs Report, AI is expected to displace 85 million jobs worldwide by 2025. Jobs in data analytics, telemarketing and customer service, coding, transportation, and retail are at risk of being replaced entirely.

Social manipulation through algorithms. AI can influence people's opinions, behaviors, and feelings through online platforms such as social networks, news media, and even online stores. It can also harm people by producing fake or misleading content such as deepfake videos.

Social surveillance with artificial intelligence. With facial recognition, location tracking, and data mining, all built on artificial intelligence, governments and companies can conduct mass surveillance of citizens and employees, threatening privacy, security, and civil liberties.

Biases baked into artificial intelligence. AI can inherit or amplify human biases present in its data or design, leading to unfair or discriminatory outcomes for groups of people based on race, gender, age, and more.

The spread of socioeconomic inequality. AI can create a digital divide between those with access to its benefits and those without, and can widen the gap between rich and poor by concentrating wealth and power in the hands of the few who control AI systems.

Autonomous weapons. Artificial intelligence can be used to build lethal autonomous weapons that fire on targets without human intervention. Some argue that replacing human soldiers with robots will reduce casualties for the country that owns them; but an army that costs an advanced country no lives also gives that country more incentive to start wars.

The future of artificial intelligence

Until a few years ago, chatbots and image generators such as ChatGPT and Midjourney were the future of artificial intelligence; now they have been publicly available for some time and are expected to improve significantly over the next few years. OpenAI, for example, is working on the fourth version of its large GPT language model, which Silicon Valley insiders claim will work wonders in the world of chatbots. There was a time when the idea of two people speaking different languages conversing and understanding each other in real time belonged to science fiction and the Mass Effect games; it is no longer unlikely that artificial intelligence will make such an idea reality.


As it stands, artificial intelligence is the defining technology of the future, and many scenarios have been sketched for its development, including:

Artificial intelligence will integrate further with human intelligence and extend our capabilities; brain-computer interfaces, natural language processing, and machine vision, for example, can enhance how we communicate, learn, and perceive.

The ultimate goal of all artificial intelligence projects is to achieve AGI

Artificial intelligence will become more autonomous and more adaptable to complex environments; self-driving cars, smart homes, and robotic assistants, for example, will operate with minimal human supervision or intervention.

Artificial intelligence will become more creative in producing content and finding new solutions; generative adversarial networks and natural language generation, for example, can produce realistic images, artwork, music, and text.

Artificial intelligence will collaborate more with other agents, human and machine alike; multi-agent systems (MAS), swarm intelligence, and reinforcement learning, for example, can enable collective decision-making, problem-solving, and coordination.

And of course artificial intelligence will become more diverse and inclusive in its data sources, design principles, applications, and effects; consider the advances in responsible AI, explainable AI (XAI), which reveals the complex patterns of machine learning to humans, fair AI, and trustworthy AI.

But the ultimate goal of everyone working in artificial intelligence is strong AI: a machine that surpasses human intellectual capabilities across all activities, something like the self-aware robots we see in movies. We are still a long way from that level; ask OpenAI's employees and they will tell you strong AI is 13 years away, but most experts in the field are betting on 50.

Will artificial intelligence destroy humanity?

Well, with all this said, and given the significant progress made in artificial intelligence, how much longer until killer robots like Skynet in the Terminator movies or HAL 9000 in 2001: A Space Odyssey show up?

If you watch wildlife documentaries, you've probably noticed that they all end with people explaining how this magnificent beauty is about to be destroyed by humans. In the same spirit, I think any responsible discussion of artificial intelligence should also cover its limitations and social consequences.

The success of AI depends heavily on the models we choose to train them

First, let's re-emphasize the current limits of artificial intelligence. If there's only one thing I hope you take away from this article, it's that the success of machine learning, and of artificial intelligence generally, depends heavily on the models we choose and how we train them. If humans build these networks without basic standards and principles, or train AI on wrong and misleading data, the resulting problems can have disastrous effects.


Deep neural networks are very flexible and powerful, but they are not magic. RNNs and CNNs are both deep neural networks, but their underlying structures are very different, and so far humans have had to define those structures in advance. So although a CNN trained to recognize cars can be retrained to recognize birds, the same model cannot be used to understand speech.

Simply put, it's as if we understand how the visual cortex and the auditory cortex work, but have no idea how the rest of the cerebral cortex works or even where to begin understanding it. That means Hollywood-style humanoid artificial intelligence probably isn't coming anytime soon. It does not mean, however, that today's AI cannot have negative social effects; which is why learning the basic concepts of artificial intelligence is perhaps the least we can do to address its problems (and prevent the destruction of the Earth!).
