If you’ve heard about neural networks and are not sure what they are, if you’re interested in artificial intelligence and how it ‘learns by itself’ to perform complex tasks in autonomous driving, healthcare, product quality assurance, and so on, or if you are already in the field and want to learn more, then I hope this blog is for you!
A quick bio:
- I’m a machine learning engineer, and for the last 3 years I have worked at a startup on different aspects of neural networks for computer vision. Before that I was a project manager and head of an algorithms and communication group in another organization.
- I have a master’s degree in astrophysics.
- …And a passion for photography, painting, and everything visual. You can see some of my work here and here.
- I have experience giving technical talks to both technical and non-technical audiences, and I like it.
- I love it when things are easy to understand, so I decided to take what I do every day and make it accessible and interesting to people both in my field and outside it.
In this and the following chapters, we will explore a family of artificial intelligence algorithms known as neural networks (another name is deep learning) — their origins, how they work, what they are good for, and why they are good at what they do. In the future we will explore more technically advanced stuff but for now I prefer to keep it more basic. I will try to keep the technical discussion suitable for everyone with a little technical background, but if you have a question or are unclear about anything — don’t hesitate to write me!
Before diving into technical details, this chapter will cover basic concepts and tasks. Neural networks (or NNs) are a type of algorithm capable of performing complex tasks formerly considered to require high cognitive skills and to be beyond the ability of a computer (hence the term ‘artificial intelligence’, or AI). The term ‘neural network’ was inspired by the resemblance of the algorithmic structure to neurons in the brain. I want to be clear about this: there is some resemblance between the two, but it’s rather shallow. Beyond the basic concept (which will be discussed later), the field of NNs has evolved into architectures and mechanisms that have very little, if anything, in common with our biological brain.
Neural networks have two operation modes or phases — training (before deployment, learning to perform a task), and inference (being deployed to perform the task). The training phase requires data. Usually, a lot of data. In the last two decades, the abundance of both computational power and accessible big data has enabled great achievements and is the main reason that neural networks have become a very hot topic.
The basic concept of algorithmic NNs dates as far back as 1943, when McCulloch and Pitts introduced a mechanism composed of several adjacent simple computation units whose combined work solves a complicated equation. The connections between the simple units and the requirement that those units ‘fire’ their results simultaneously inspired them to call this mechanism a ‘nerve net’. Our next stop is the development of a binary classifier called the ‘perceptron’ (which will be presented soon) by Rosenblatt in 1957. An extremely important mechanism, introduced in 1986, is back-propagation — it is the heart of a neural network’s ability to learn.
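To make the perceptron idea a bit more concrete before we meet it properly, here is a toy sketch (not the historical implementation — the data, learning rate, and epoch count are made up for illustration): a single computation unit learning the logical AND function with the classic perceptron update rule.

```python
import numpy as np

# Toy training data: the logical AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # inputs
y = np.array([0, 0, 0, 1])                      # target labels

w = np.zeros(2)  # weights, one per input
b = 0.0          # bias
lr = 0.1         # learning rate (an arbitrary choice)

# Training phase: nudge the weights whenever the prediction is wrong.
for _ in range(20):
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        w += lr * (yi - pred) * xi
        b += lr * (yi - pred)

# Inference phase: the trained unit now classifies all four inputs correctly.
preds = [1 if xi @ w + b > 0 else 0 for xi in X]
print(preds)  # [0, 0, 0, 1]
```

The key point is the two phases from above: a data-driven training loop that adjusts parameters, followed by cheap inference with the parameters frozen.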
Typical tasks that NNs can handle are:
- Object Classification — One of the most basic tasks in NN computer vision — given an image with a single object — return the class of the object (e.g. person, cat, bicycle). Similar tasks such as character/letter/digit classification are also under this category.
- Object Detection in images — given an image with (potentially) more than one object — not only classifying the objects, but also localizing them in the image, i.e. drawing boxes around them.
- Object Segmentation — a finer localization: here the algorithm classifies each pixel in the image according to which object it belongs to, giving tighter boundaries around objects.
- Image Colorization — one of my favorites (guess why) — a task in a field called Image Enhancement — start with an image with corrupted color (e.g. old b/w, sepia, or those bluish underwater photos) and paint it in natural-looking colors. For some fantastic examples, go here.
- Super Resolution — another one of my favorites — also an Image Enhancement task — start with a low-resolution image and expand it (e.g. from 480x640 pixels to 1920x2560 pixels) while keeping the image sharp (when you just expand the dimensions of an image without sophistication, you get a large image which is blurry or pixelated).
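That last parenthetical is easy to see in code. A minimal sketch (with a made-up 2x2 grayscale “image”): naively repeating pixels does expand the dimensions, but every new pixel is a copy of an old one — the blocky, pixelated look. A super-resolution network instead has to invent plausible new detail.

```python
import numpy as np

# A tiny 2x2 grayscale "image" (values are arbitrary intensities).
img = np.array([[0, 100],
                [200, 50]], dtype=float)

# Naive nearest-neighbor upscaling: repeat each pixel 2x along both axes.
scale = 2
big = img.repeat(scale, axis=0).repeat(scale, axis=1)

print(big.shape)  # (4, 4) — larger, but no new detail was added
print(big)
# [[  0.   0. 100. 100.]
#  [  0.   0. 100. 100.]
#  [200. 200.  50.  50.]
#  [200. 200.  50.  50.]]
```

Smarter interpolation (bilinear, bicubic) blurs instead of blocking; either way, the information simply isn’t there — which is what makes super resolution a learning problem rather than a resizing problem.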
There are many other tasks that use NNs, but that’s enough for now.
I should mention that computer vision (CV) as a field existed long before neural networks became popular. But the advances in NNs, which perform many CV tasks at accuracy comparable to a human’s but much faster, revolutionized the field. The natural thing to do was to look for tasks that were previously (or are currently) done by humans and see if we can offload them to computers running neural networks.
Autonomous vehicles, and their more common and realistic counterpart — ADAS (Advanced Driver Assistance Systems) — are among the systems requiring automated, accurate, and fast computer vision skills such as object detection. Forward collision alert, pedestrian and cyclist tracking, lane keep assist, and drowsiness detection are already a reality in modern cars and are achieved in many cases with forward-facing (or driver-facing) cameras feeding a computer running neural networks.
Medical Imaging is another field in which classification, object detection, and segmentation tasks are in high demand. This excellent paper gives a good overview of the subject. In short, the same principles that allow a neural network to detect a road sign are used to detect features in a medical image corresponding to a bone fracture, a tumor, etc. As with autonomous vehicles vs. ADAS — we are still not at a point where we blindly trust a computer algorithm to do everything — but these systems take a significant load off the human experts and allow them to do more in the same amount of time.
Face recognition, a task with many use cases — from tagging your drunk friends in a photo from last night’s Halloween party on FB, to biometric gates at an airport — also uses various methods of AI, among them neural networks, and is closely related to classification tasks.
Automated production lines, manufacturing thousands of units per hour, can and do benefit from computer vision systems that guide robotic arms to handle products or detect flaws.
Image Enhancement tasks, such as super resolution (SR) and denoising, are needed not only for aesthetic reasons (such as improving the quality of photos you took on a trip to Iceland in bad conditions, or when the camera was accidentally set to low resolution), but also for very practical (and potentially lucrative) needs such as:
- Video streaming — giving your viewers at home a 4K resolution viewing experience even though your media or cable infrastructure supports only a lower resolution.
- Consumer product camera improvement — doing more with less — in-product post-processing software that gives your smartphone user better photo quality from cheaper (noisier and/or lower-resolution) camera sensors and optics.
- These tasks can also be used as a pre-processing stage in modular systems for other tasks (for example, increasing the resolution of a medical image to improve classification performance, or the resolution of a small crop containing a human face, for a face-recognition system).
Hopefully this short intro has given you a sufficient understanding of the origins of NNs and the motivation for using them for various automation purposes.
In the next chapter we will explore the structure of a basic neural network.