Affine recently completed 6 years, and I have been a part of it for about 3 of those years. As an analytics firm, the most common business problem we have come across is forecasting consumer demand. This is particularly true for Retail and CPG clients.
Over the last few years we have dealt with simple forecasting problems, for which we can use simple time-series forecasting techniques like ARIMA and ARIMAX, or even linear regression. These forecasts are typically made at the organization level or for specific business divisions. But over the years we have seen a distinct shift in focus among our clients toward forecasts at a more granular level, sometimes even for specific items. Such forecasts are difficult to attain using simple techniques. This is where more sophisticated machine learning techniques, such as Random Forest (RF) and XGBoost, come into play.
The loose definition of data science is analyzing a business's data to produce actionable insights and recommendations for that business. The simplicity or complexity of the analysis, aka the level of “Data Science Sophistication”, also impacts the quality and accuracy of results. This sophistication is essentially a function of 3 main data science components: technological skills, math/stats skills, and the business acumen needed to define and deliver a relevant business solution. These 3 pillars have been the mainstay of data science ever since businesses started embracing it over the past two decades, and should continue to be in the future. What has changed, and will keep changing, is the underlying R&D in technology and statistical techniques. I have not witnessed many other industries where skills become obsolete at such a fast rate. Data science is unique in requiring data scientists and consulting firms to constantly update their skills and be forward-looking in adopting new and upcoming ones. This article is an attempt to look at how the tool/tech aspects of data science have evolved over the past few decades and, more importantly, what the future holds for this fascinating tech- and innovation-driven field.
In a world of extreme competition, expense reduction is the mantra for most organizations. Companies, primarily in the retail and CPG industries, focus on cutting costs and maintaining optimum inventory levels to gain a competitive edge. To accomplish this, forecasting demand is of utmost importance. A macro-level sales forecast for the entire organization is not enough.
Efficient and accurate demand forecasting enables organizations to anticipate demand and allocate the optimal amount of resources, minimizing stagnant inventory. This reduces wasted resources as well as costs such as storage and transportation. Another side effect of accurate demand prediction is the prevention of shrinkage, so that firms don’t have to give huge discounts to clear stock.
This excerpt will touch upon the steps in demand forecasting and briefly talk about the different demand forecasting methods. The article ends with some challenges of demand forecasting.
The above photo was not created by a specialized app or Photoshop. It was generated by a deep learning algorithm that uses convolutional networks to learn artistic features from various paintings and then transforms any photo to depict how an artist would have painted it.
Convolutional Neural Networks have become part of state-of-the-art solutions in areas like:
Self-driving cars, for identifying pedestrians and objects.
Natural Language Processing.
Emotion recognition. A few days back Google surprised me with a video called Smiles 2016, which put together all my photos from 2016 where I was partying with family, friends, and colleagues. It was a collection of photos where everyone in the photo was smiling. We will discuss a couple of the deep learning architectures that power these applications in this blog.
Before we dive into CNNs, let's try to understand why we don't just use a feed-forward neural network. According to the universality theorem, which we discussed in the previous blog, a network can approximate a function to any desired accuracy by adding neurons (functions), but there is no guarantee on how long it will take to reach the optimal solution. Feed-forward neural networks flatten an image into a single vector, thus losing the spatial information that comes with the image. So for problems where spatial features matter, CNNs tend to achieve higher accuracy in much less time than feed-forward neural networks.
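To make the point about flattening concrete, here is a minimal sketch in plain Python. The tiny 3x4 grayscale image and the helper name `flatten` are assumptions made up for illustration. It shows that two pixels sitting directly above one another end up a full row width apart once the image is flattened, so their closeness is no longer visible in the vector.

```python
def flatten(image):
    """Flatten a 2-D grid (list of rows) into one row-major vector."""
    return [pixel for row in image for pixel in row]

# Hypothetical 3x4 grayscale image; each value encodes its (row, col) position.
image = [[10 * r + c for c in range(4)] for r in range(3)]
flat = flatten(image)

width = len(image[0])
# Pixel (1, 1) and the pixel directly below it, (2, 1).
idx_a = 1 * width + 1
idx_b = 2 * width + 1
gap = idx_b - idx_a

print(flat)
print("vertical neighbours are", gap, "positions apart in the flat vector")
```

A convolutional layer, by contrast, always looks at small 2-D patches, so vertical and horizontal neighbours are treated alike.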
Before we dive into what a Convolutional Neural Network is, let's get comfortable with the nuts and bolts that form it.
Before we dive into CNNs, let's take a look at how a computer looks at an image.
What we see
What a computer sees
Wow, it’s great to know that a computer sees images and videos as matrices of numbers. A common way of representing an image in computer vision is a matrix of dimensions Width * Height * Channels, where the channels are Red, Green, and Blue, and sometimes an alpha channel as well.
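As a rough illustration, the sketch below builds a tiny made-up RGB image as nested Python lists. The pixel values and the helper name `channel` are assumptions for this example. Note that it stores rows first (height, then width, then channels), a common in-memory layout; the idea is the same as the Width * Height * Channels view described above.

```python
# A hypothetical 2x3 RGB image: 2 rows (height), 3 columns (width), 3 channels.
# Each pixel is [red, green, blue] with values from 0 to 255.
image = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],        # red, green, blue pixels
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],  # black, gray, white pixels
]

height = len(image)
width = len(image[0])
channels = len(image[0][0])

def channel(img, k):
    """Return channel k of the image as a 2-D grid of numbers."""
    return [[pixel[k] for pixel in row] for row in img]

red = channel(image, 0)
print((height, width, channels))
print(red)
```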
Filters are small matrices of numbers, usually of size 3*3*3 (width, height, channels) or 7*7*3. Filters perform various operations on a given image, like blur, sharpen, and outline. Historically, these filters were carefully hand-picked to extract various features of an image. In our case, the CNN learns these filters automatically using a combination of techniques like gradient descent and backpropagation. Filters are moved across an image from the top left to the bottom right to capture all the essential features. They are also called kernels in neural networks.
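For a feel of what hand-picked filters look like, here is a small sketch with three widely used 3*3 kernels for a single channel (a 3*3*3 filter would stack one such grid per channel). The helper names `kernel_sum` and `respond` are made up for this example. It applies each kernel to one uniform patch: blur and sharpen leave a flat region unchanged, while the outline kernel responds with zero because there is no edge to find.

```python
# Classic hand-picked 3x3 kernels (single channel).
box_blur = [[1 / 9] * 3 for _ in range(3)]           # average of the neighbourhood
sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]      # boost centre against neighbours
outline = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]  # respond only to changes

def kernel_sum(kernel):
    """Sum of all weights; 1 preserves brightness, 0 cancels flat regions."""
    return sum(sum(row) for row in kernel)

def respond(patch, kernel):
    """Multiply a 3x3 patch by a 3x3 kernel element-wise and add everything up."""
    return sum(patch[a][b] * kernel[a][b] for a in range(3) for b in range(3))

# A hypothetical patch where every pixel has the same brightness.
flat_patch = [[10] * 3 for _ in range(3)]

for name, kernel in [("blur", box_blur), ("sharpen", sharpen), ("outline", outline)]:
    print(name, round(respond(flat_patch, kernel), 6))
```

A CNN starts from random numbers in these grids and adjusts them during training instead of relying on such fixed choices.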
In a convolutional layer, we convolve the filter with patches across an image. For example, the left-hand side of the image below is a matrix representation of a dummy image, and the middle is the filter or kernel. The right side shows the output of the convolution layer. Look at the formula in the image to understand how the kernel and a part of the image are combined to form a new pixel.
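The sliding operation can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a single channel, a stride of 1, and no padding; the 5x5 image and 3x3 kernel are made-up values and not necessarily the ones in the figure. As in most CNN libraries, the kernel is not flipped, so strictly speaking this computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Each output pixel is the sum of element-wise products between
    the kernel and the image patch it currently covers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):          # top to bottom
        row = []
        for j in range(out_w):      # left to right
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 5x5 dummy image and 3x3 kernel.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

A 5x5 image convolved with a 3x3 kernel gives a 3x3 output, since the kernel fits in three positions along each axis.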
Data science helps us extract knowledge or insights from data, either structured or unstructured, by using scientific methods like mathematical or statistical models. In the last two decades, it has become one of the most popular fields with the rise of big data technologies. A lot of companies, such as Amazon, Netflix, and Google Play, use recommendation engines to promote products and suggestions in accordance with users’ interests. Many other applications, like image recognition, gaming, and airline route planning, also involve big data and data science.
Sports is another field that uses data science extensively to improve strategies and predict match outcomes. Cricket is a sport where machine learning has scope to dive into quite a large outfield. It can go a long way towards suggesting optimal strategies for a team to win a match or for a franchise to bid on a valuable player.
Recently, when I was reading up on Cyber Security & Threat Detection, I came across “The Annual Data Breach Report by Verizon”. The report analyzed thousands of incidents that happened over the last couple of years, as reported by various companies and public & private organizations. It analyzed breaches by firmographics, geographies, industries etc. and found that cyber intrusion is a growing threat to every industry in every country of the world. The report shows time and again that “No single industry or organization in the world is safe from Cyber Threats”. This piqued my curiosity, and I felt that we could use all the goodness of data science to effectively tackle this problem. I designed a Threat/Intrusion Detection System, which could be used to detect such data leaks/breaches and take preventive action to contain, if not stop, the damage due to a breach.
Traditional machine learning used handcrafted features and modality-specific models to classify images and text or to recognize voices. Deep learning / neural networks identify features and find patterns automatically. The time needed to build solutions for these complex tasks has been drastically reduced, and accuracy has increased significantly, because of advancements in deep learning. Neural networks were partly inspired by how the roughly 86 billion neurons in a human brain work, and have become more of a mathematical and computational problem. We will see by the end of the blog how neural networks can be intuitively understood and implemented as a set of matrix multiplications, a cost function, and optimization algorithms.
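As a first taste of those three ingredients, here is a deliberately tiny sketch in plain Python: a single linear layer (one matrix multiplication), a mean squared error cost function, and plain gradient descent as the optimizer. The toy data, the learning rate, and the helper names `matmul`, `transpose`, and `mse` are all assumptions made up for illustration. A real neural network stacks several such layers with non-linear activations in between.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mse(pred, target):
    """Cost function: mean squared error over all samples."""
    return sum((p[0] - t[0]) ** 2 for p, t in zip(pred, target)) / len(pred)

# Toy data generated from y = 2*x1 - 3*x2 + 1 (the last column of 1s is the bias input).
X = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
Y = [[1.0], [3.0], [-2.0], [0.0]]

W = [[0.0], [0.0], [0.0]]   # weights start at zero
learning_rate = 0.1         # assumed value, small enough for this toy problem

initial_loss = mse(matmul(X, W), Y)
for step in range(2000):
    pred = matmul(X, W)                                   # forward pass
    error = [[p[0] - t[0]] for p, t in zip(pred, Y)]
    grad = matmul(transpose(X), error)                    # gradient of the cost, up to 2/n
    W = [[w[0] - learning_rate * 2 * g[0] / len(X)]       # gradient descent update
         for w, g in zip(W, grad)]
final_loss = mse(matmul(X, W), Y)

print("loss before:", initial_loss, "loss after:", final_loss)
print("learned weights:", [round(w[0], 3) for w in W])
```

The learned weights should end up close to the values used to generate the toy data, which is all "training" means here: repeatedly nudging the numbers in W to lower the cost.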