By Weihao Gan
On April 10th, our department is hosting our first Emerging Trends seminar. Meant to bring together the diverse areas of our department around a single topic and encourage collaboration, the first seminar features Professor Jay Kuo, who will speak about his work with neural networks and deep learning. To highlight Neural Networks, we’ll be running three blogs this week by different department community members.
In this first blog, Dr. Kuo’s PhD student, Weihao Gan, answers some questions about what neural networks are, what he’s working on in the lab, and how they are already changing the world.
1. What are Neural Networks?
A neural network is a computational architecture built from a large collection of artificial neurons, meant to mimic the behavior of the neurons in a human brain. The brain works by transferring and processing a perceived signal through millions of nerve cells. This is how we understand the world and our surroundings.
Modern neural network architectures typically consist of multiple layers, with the neurons in one layer connected to the neurons in the next. An effective neural network needs the same thing that a growing child’s brain needs: exposure to large amounts of information.
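To make the idea of layers concrete, here is a minimal sketch in plain Python of a signal passing through a small layered network. The layer sizes, the random weights, the sigmoid activation, and the names `make_layer` and `forward` are all assumptions chosen for illustration; a real network would learn its weights from data rather than draw them at random.

```python
import math
import random

def make_layer(n_inputs, n_neurons, rng):
    """Create one layer: each neuron has one weight per input plus a bias.
    Weights are random here purely for illustration (an assumption)."""
    return [
        {"weights": [rng.uniform(-1, 1) for _ in range(n_inputs)],
         "bias": rng.uniform(-1, 1)}
        for _ in range(n_neurons)
    ]

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    """Pass a signal through each layer in turn.
    The outputs of one layer become the inputs of the next."""
    signal = inputs
    for layer in layers:
        signal = [
            sigmoid(sum(w * s for w, s in zip(neuron["weights"], signal))
                    + neuron["bias"])
            for neuron in layer
        ]
    return signal

rng = random.Random(0)  # fixed seed so the sketch is repeatable
network = [make_layer(3, 4, rng), make_layer(4, 2, rng)]  # 3 inputs -> 4 -> 2
output = forward(network, [0.5, -0.2, 0.1])
print(output)
```

Each neuron computes a weighted sum of the signals it receives and passes the result on, which is the sense in which neurons in different layers are connected together.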
There are two major types of modern neural network models: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). They each have different strengths: CNNs are better at handling 2D images, while RNNs are more effective on sequential signals such as speech.
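The difference between the two can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the tiny image, the hand-picked edge kernel, the scalar weights, and the names `convolve2d` and `run_rnn` are all made up for the example, and real CNNs and RNNs learn many such kernels and weights.

```python
import math

def convolve2d(image, kernel):
    """Slide a small kernel over a 2D image (no padding, stride 1).
    As in most deep learning libraries, the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def run_rnn(sequence, w_in, w_state):
    """Process a sequence one step at a time, carrying a hidden state
    forward so that earlier inputs influence later outputs."""
    state = 0.0
    states = []
    for x in sequence:
        state = math.tanh(w_in * x + w_state * state)
        states.append(state)
    return states

# CNN-style operation: the same kernel looks at every local patch of the image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]  # responds where dark pixels meet bright ones
feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]

# RNN-style operation: the state remembers the first input after it has passed.
states = run_rnn([1.0, 0.0, 0.0], w_in=1.0, w_state=0.5)
print(states)
```

The feature map lights up only in the column where the vertical edge sits, while the recurrent state stays positive after the input has dropped to zero, which is why one design suits images and the other suits sequences.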
2. What are they being used for now?
CNNs are commonly used for things like:
- Detecting and recognizing different objects in an image/video
- Segmenting image regions into different groups
- Understanding the content and structure of indoor/outdoor scenes
- Analyzing the behavior of people/objects in videos
RNNs are focused on other areas:
- Translating from one language to another
- Recognizing speech from different people
- Describing an image/video in sentences
Researchers are also combining CNNs and RNNs to improve performance on tasks such as image/video captioning.
3. What are you working on?
I’m working on visual object tracking. The goal of tracking is to locate a target in every frame of a video with a bounding box, given only the target’s location in the first frame. This is a very important but challenging topic in the field of computer vision because of the dynamic nature of a target in a video. The size, shape, orientation, and illumination of an object all change constantly as it moves around.
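As a rough illustration of what a bounding box looks like in code, the sketch below represents a box by its top-left corner, width, and height, and compares a tracker's predicted box against the true one using intersection over union (IoU). IoU is a common overlap measure in tracking, but its use here is an assumption, as are the `Box` class, the `iou` function, and the example coordinates; none of them come from the lab's own tracker.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned bounding box (hypothetical representation)."""
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width
    h: float  # height

def iou(a, b):
    """Intersection over union: 1.0 for identical boxes, 0.0 for no overlap."""
    overlap_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    overlap_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    intersection = overlap_w * overlap_h
    union = a.w * a.h + b.w * b.h - intersection
    return intersection / union if union > 0 else 0.0

# Made-up example: the tracker's guess is shifted 5 pixels to the right.
true_box = Box(x=10, y=10, w=20, h=20)
predicted_box = Box(x=15, y=10, w=20, h=20)
print(iou(true_box, predicted_box))  # 0.6
```

A score near 1.0 means the tracker is still locked onto the target, while a score that drifts toward 0.0 over successive frames means it has lost it.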
With the development of modern neural networks, trackers based on CNNs outperform traditional visual trackers by a large margin. This is a very typical example of how deep CNN architectures have come to dominate the computer vision field.
I’m also working on a CNN-based tracking solution. Tracking techniques like this are used in many real-world applications, such as traffic control in video surveillance, autonomous vehicles, missile tracking and navigation in military defense, medical imaging and more.
4. What are some future applications?
Neural networks have great potential for future applications. Their performance on the tasks above will keep improving, and they can be applied to things like autonomous vehicles, robotics and more. Think of a scenario in which you get into an automated car and, while you’re inside, you chat with friends, watch a movie or even sleep while the car takes you to your destination automatically. This is one major way neural networks will influence our lives in the very near future. Another cool application is in-house robots that can clean, cook, and take care of babies, the elderly, or people with disabilities or injuries. It’s really amazing!
There are also some very important potential applications in other fields. Neural networks have the potential to be trained to model the behavior of genes, drugs, and proteins. They can then be used to design new medicines. We may also use them to predict upcoming outbreaks of illnesses like the flu.
Education will also be influenced heavily by neural networks. A computer trained on a large enough amount of data has the potential to teach children different subjects, such as math, language and even design.
5. What is the single biggest way it could change the world?
It’s difficult to point out the single biggest way it could change the world because deep neural networks can be applied to nearly everything. We will see a lot of amazing things in our lifetime, and these changes will definitely provide tremendous advantages and conveniences for us: Truck drivers won’t need to drive through the night. Workers won’t need to go into dangerous places. With the push of a button or one simple spoken command, an artificial intelligence agent will do so much for us.
However, many people have valid concerns about the effects this technology will have on society. One fact we cannot deny is that machines are already replacing many jobs formerly done by humans.
As engineers, we’re responsible for creating a lot of the technology that changes the world. Therefore, we also have a profound responsibility to think about society and the human impact of our work. While we must embrace new technologies and see their potential for good, we must also be prepared to address the changes they will bring.
Weihao Gan is a PhD student in Dr. Jay Kuo’s Media Communications Lab at the Ming Hsieh Department of Electrical Engineering at USC. His research interests include computer vision, machine learning and visual perception.