
Posts

What is AI?

 It actually seems funny to write an answer to this question (as it's so unusual to find an article about this these days 🤔). AI is short for Artificial Intelligence, or intelligence created by humans. But what is intelligence, then? Intelligence breaks down into the tasks that beings are capable of doing: thinking, memorising, remembering, deciding, reasoning, predicting, recognising, improving, inventing, reproducing, dreaming, assuming, surviving, feeling, hoping, coping. All these tasks ending in 'ing' remind us that they never end until life does (except for reproducing 😉). The thing that makes beings truly alive is knowing that they are.  But are all beings intelligent? Not all of them carry out all those tasks. Being the smartest of all, we humans still don't know whether a mouse dreams or not (at least Jerry does 😏). But we do know that beings with smaller brains, or fewer brain cells, cannot carry out complex tasks. I'm sure an amoeba can't recog...
Recent posts

Do you know Large Language Models?

Large Language Models, or LLMs, seem very new, as the term rose to popularity with ChatGPT (2022), but language models themselves have existed for decades, at least since Claude Shannon's 1948 paper "A Mathematical Theory of Communication". Although language models at that time were not large, the main idea is still the same. Shannon's paper popularised the method of n-grams, in which the probability of the next word, drawn from a vocabulary list, is estimated from the preceding 'n - 1' words in the sentence (hence n-gram). It rests on the notion that the context (neighbouring words) carries information about the word in consideration. If you are aware of Convolutional Neural Networks, they exploit the same idea but in terms of image pixels.  In the 1980s, a line of work with roots in mathematical biophysics introduced the concept of Recurrent Neural Networks (RNNs), in which a series of mathematical neurons connected sequentially predict an output for a sequent...
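The n-gram idea above can be sketched in a few lines of Python. This is a minimal bigram (n = 2) model: it counts how often each word follows another and turns the counts into next-word probabilities. The tiny corpus is made up purely for illustration.

```python
from collections import defaultdict, Counter

# Toy corpus, invented for the example.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows the previous one (bigram counts).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probs(word):
    # Turn raw counts into a probability distribution over next words.
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

A real language model does the same thing at a vastly larger scale, with smoothing for unseen word pairs; the sketch only shows the core counting idea.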

Do you know the Math for Machine Learning?

There are two different levels of roles in the field of Machine Learning: the ML Engineer and the Research Scientist. The two roles differ in essence. The Engineer implements the algorithms for real-world tasks, which is not too different from regular software engineering; the main difference is that ML algorithms require a lot of hyper-parameter tuning. Success requires working knowledge of ML algorithms and a fair amount of mathematical intuition behind them. The Research Scientist designs the algorithms, and needs extensive math knowledge not only to understand every piece of existing algorithms but also to develop new ones. The only thing that separates them from Mathematicians is that the math is not required in its entirety. The following are the high-level topics in Mathematics required for each of these roles. For Engineers: Linear Algebra, Probability and Statistics, Basic C...

Do you know steps in building a full Machine Learning model?

1. Data Collection In Machine Learning, data is the most important thing. Unlike humans, who look at a person's face a few times and then recognize them, ML needs tons of data. A 2001 paper from Microsoft showed that moderate and complex models performed almost the same given sufficient data.  Apart from quantity, the quality of data is also important: data that does not capture the relation between features and their label is of no use.  2. Data Preprocessing Preprocessing the data is essential before feeding it to the algorithm: removing irrelevant features, merging highly correlated features, removing or manually filling missing values, and converting data to numeric values. Suppose the data contains a feature representing the country, and your dataset consists of many countries. That feature might be moderately correlated with your output, so you might not want to remove it; instead, you can convert it into a one-hot encoding (a zero vector of length equal to the number of c...
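The one-hot encoding step described above can be sketched as follows. The country names are just illustrative; any categorical feature works the same way.

```python
# Known categories for the feature (illustrative names).
countries = ["India", "France", "Brazil"]
index = {c: i for i, c in enumerate(countries)}  # category -> position

def one_hot(country):
    # A zero vector of length equal to the number of categories,
    # with a 1 at the position of the given category.
    vec = [0] * len(countries)
    vec[index[country]] = 1
    return vec

print(one_hot("France"))  # [0, 1, 0]
```

In practice a library encoder (e.g. scikit-learn's OneHotEncoder) handles unseen categories and sparse output, but the underlying transformation is exactly this.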

Do you know K-Nearest Neighbors?

 K-Nearest Neighbors is an instance-based algorithm. You don't build an ML model with trainable variables; instead, you compare a query point with the already known data to make a prediction. The algorithm is fitted with labeled training data (having many features); then, when a prediction is queried, it compares all the features of the given data point with the already known data, the training set. The 'K' most similar, i.e. closest (by numerical distance), points are returned, and the predicted class is the majority class among those K data points.  KNN can be implemented in Python using the scikit-learn library, which contains a built-in implementation. How do you decide the value of K? Does it matter? The value of K decides how well your model fits the data. If the value is very low, say 1, then the predicted class is that of the single closest data point, and even very close data points might get different class outputs. This means your model is overfitted or has a very hig...
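The procedure above is short enough to write from scratch; this is a sketch of the same idea that scikit-learn's KNeighborsClassifier implements (with many optimizations). The toy points and labels are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    # Euclidean distance from the query to every training point.
    dists = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    # Majority vote among the labels of the k closest points.
    k_labels = [y for _, y in dists[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Two clearly separated toy clusters.
train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, (2, 2), k=3))  # "A"
```

Note that nothing is "trained" here: prediction is just a distance computation over the stored training set, which is what "instance-based" means.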

Do you know Overfitting?

While training Machine Learning models we often encounter the problem of overfitting. It simply means that our model has learned the training data's mapping from features to target values by rote. While the training loss might be very low (it can even reach 0 when overfitted), the model will not perform well on the test data (the unseen data). The reason is that the model has only memorized the training data and learned nothing of value.  Why does overfitting occur? Insufficient data: if your training data isn't enough, your model will only be trained on patterns in the small given dataset, which may not occur in the test data and/or at inference. Model is too complex: if you are using a more complex model than required, the model will memorize noise that is present in the training set but not in the test/inference data. Overtraining the model: if your validation loss starts going up again and you continue to train, it will overfit. How do you avoid/fix overfitting? Increa...
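The "memorized, learned nothing" failure mode can be made concrete with a deliberately overfitted toy "model": a lookup table over the training set. Its training accuracy is perfect, yet it can only guess on anything it hasn't seen. The data is invented for the example.

```python
# Toy training set: input tuple -> label (made up for illustration).
train = {(0.0,): 0, (1.0,): 1, (2.0,): 0, (3.0,): 1}

def memorizer(x):
    # Perfect recall on seen points, a blind default on anything unseen.
    return train.get(x, 0)

train_acc = sum(memorizer(x) == y for x, y in train.items()) / len(train)
print(train_acc)  # 1.0 -- zero training error, yet useless on new inputs
```

A very high-capacity model fitted on too little data behaves much like this table: the gap between training and test performance is the symptom to watch for.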

Do you know Reinforcement Learning?

 Reinforcement Learning is different from the other two kinds, as here an agent interacts with an environment. The agent initially selects actions at random, but after performing an action it receives positive or negative reinforcement (a reward or a punishment), and ultimately the agent tries to select those actions which will earn it the most reward. The agent selects actions based on a policy.  A policy is just a probability function that takes the state, i.e. the observed environment, as input and gives a probability for each valid action (summing to one). As the agent observes the relation between states, actions, and rewards, it updates its policy function.  There are two kinds of value functions: the state-value function and the action-value function. The state-value function takes the state as input and gives the value, or expected reward, associated with that state. The action-value function takes a state and an action as input and gives the value, or expected reward, of taking that actio...
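The loop described above, act, observe a reward, update value estimates, can be sketched with a tabular action-value function and epsilon-greedy action selection. This is a simplified bandit-style update (no discounting over future states), and the two-state "environment" and its reward rule are made up for illustration.

```python
import random

actions = ["left", "right"]
# Action-value table: Q[state][action] = current estimate of reward.
Q = {s: {a: 0.0 for a in actions} for s in [0, 1]}

def choose_action(state, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(actions)        # explore: random action
    return max(Q[state], key=Q[state].get)   # exploit: best known action

def update(state, action, reward, alpha=0.5):
    # Nudge the estimate toward the observed reward.
    Q[state][action] += alpha * (reward - Q[state][action])

random.seed(0)
for _ in range(200):
    s = random.choice([0, 1])
    a = choose_action(s)
    # Hidden rule of the toy environment: "right" pays in state 0, "left" in state 1.
    r = 1.0 if (s == 0 and a == "right") or (s == 1 and a == "left") else 0.0
    update(s, a, r)

print(choose_action(0, epsilon=0.0))  # "right" -- the agent learned the rule
```

The epsilon parameter is the exploration/exploitation trade-off: without occasional random actions, the agent could get stuck exploiting an early, poor estimate forever.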

Do you know Supervised and Unsupervised Learning?

Supervised Learning.  The training data fed to the algorithm has the actual outputs (labels) associated with it. A simple example is email spam detection: the algorithm is trained on many example emails along with their class labels, and it must learn how to classify new emails. That is a typical classification task; the other kind of task, in which the target values are continuous numbers, is called regression. A set of features (X) is fed into the algorithm, and the actual outputs (Y) are numerical values. Some supervised learning algorithms are: Linear Regression, Logistic Regression, K-Nearest Neighbors, Decision Trees, Random Forest, Neural Networks, and Support Vector Machines. Unsupervised Learning. Here the training data is not labeled. Suppose you want to segregate customer groups, or find similarities among groups of data that you might not have expected.  Another task is Dimensionality Reduction: if your dataset consists of many features, then without losing signi...

Do you know Machine Learning?

 Machine Learning is like Jesus, it's everywhere... From pizzerias to NotCo (a company which uses AI to make vegan food that tastes like meat), and from banks to Netflix, everyone is using Machine Learning. But can machines actually learn something? 🧐 There are many algorithms that improve their performance on a particular task with experience; that's it. By the way, if anyone asks, that was the definition of Machine Learning. The fact that computer systems can actually increase their performance, i.e. learn tasks, is what drives AI.  Machine Learning is basically divided into 3 categories, viz, Supervised Learning, Unsupervised Learning, and Reinforcement Learning.  Supervised Learning is learning from a training set of labeled examples provided by a knowledgeable external supervisor. Each example is a description of a situation together with a specification, the label, of the correct action the system should take in that situation, which is often to ident...