# Tag: PCA

Learn how to use the PCA algorithm to find variables that vary together.

A step-by-step tutorial explaining how PCA works and how to implement it from scratch in Python.

Principal Component Analysis (PCA) is a commonly used dimensionality reduction method. It works by computing the principal components and performing a change of basis, retaining the data in the directions of maximum variance. The reduced features are uncorrelated with each other and can be used for unsupervised clustering and classification.
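The steps described above (compute principal components, change basis, keep the directions of maximum variance) can be sketched with plain NumPy. This is a minimal illustration on hypothetical random data, not the article's own implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 correlated 2-D points.
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])

# 1. Center the data.
Xc = X - X.mean(axis=0)

# 2. Covariance matrix of the features.
cov = np.cov(Xc, rowvar=False)

# 3. Eigendecomposition: the eigenvectors are the principal components.
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort by descending eigenvalue (direction of maximum variance first).
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5. Change of basis: project onto the top k components.
k = 1
X_reduced = Xc @ eigvecs[:, :k]

print(X_reduced.shape)  # (200, 1)
```

Projecting onto all components instead of the top `k` shows the other claim in the teaser: the transformed features come out uncorrelated.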

## What do they tell us about our data?

I learned about eigenvalues and eigenvectors at university in a linear algebra course. It was very dry and mathematical, so I did not grasp what it was all about. Here I want to present the topic in a more intuitive way, using many animations to illustrate it.

First, we will look at how applying a matrix to a vector rotates and scales it. This will show us what eigenvalues and eigenvectors are. Then we will learn about principal components and see that they are the eigenvectors of the covariance matrix. This knowledge will help us understand our final topic, principal component analysis.

To understand eigenvalues and eigenvectors, we have to first take a look at matrix multiplication.
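The defining property can be shown in a few lines of NumPy: multiplying an eigenvector by its matrix only scales it (A v = λ v), while other vectors also change direction. The matrix and vectors below are illustrative choices, not taken from the article:

```python
import numpy as np

# A symmetric matrix that rotates and scales most vectors.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

v = np.array([1.0, 1.0])  # an eigenvector of A
print(A @ v)              # [3. 3.], i.e. 3 * v: only scaled, not rotated

w = np.array([1.0, 0.0])  # not an eigenvector
print(A @ w)              # [2. 1.]: direction changes as well
```

Here the eigenvalue is 3, the factor by which `v` is stretched.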

## Clustering types and their usage areas, explained with a Python implementation

With unsupervised learning, unlabeled datasets can be grouped by their similar properties. However, each algorithm takes a different view of what makes features similar. Besides labeling the data, unsupervised learning provides detailed information about the dataset.
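As a minimal sketch of grouping an unlabeled dataset by similarity, here is k-means from scikit-learn on a synthetic dataset (the data and parameters are illustrative assumptions, not from the article):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical unlabeled dataset: three well-separated groups of points.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# K-means judges "similarity" as distance to a cluster centroid.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
labels = km.labels_  # an assigned group label for each point

print(np.unique(labels))  # [0 1 2]
```

Other algorithms (DBSCAN, hierarchical clustering, etc.) use different notions of similarity, which is exactly the point the teaser makes.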

## The theory and practice of Principal Component Analysis, with a Python implementation

This article covers the definition of PCA, a Python implementation of its theory without the Sklearn library, the difference between PCA and feature selection & feature extraction, its use in machine learning & deep learning, and the types of PCA, explained with an example.

By Aaron Wang, Master of Business Analytics @ MIT | Data Science.

This Data Science cheat sheet covers over a semester of introductory machine learning and is based on MIT’s Machine Learning courses 6.867 and 15.072. You should have at least a basic understanding of statistics and linear algebra, although beginners may still find this resource helpful.

Inspired by Maverick’s Data Science Cheatsheet (hence the 2.0 in the name), located here.

Topics covered:

• Linear and Logistic Regression
• Decision Trees and Random Forest
• SVM
• K-Nearest Neighbors
• Clustering
• Boosting
• Dimension Reduction (PCA, LDA, Factor Analysis)
• Natural Language Processing
• Neural Networks
• Recommender Systems
• Reinforcement Learning
• Anomaly Detection
• Time Series
• A/B Testing

This cheat sheet will be occasionally updated with new and improved info, so consider following or starring the GitHub repo to stay up to date.

Hyperspectral data expands the capabilities of image classification. It not only distinguishes different land-cover types but also provides detailed characteristics of each, such as minerals, soil, man-made structures (buildings, roads, etc.), and vegetation types.

One disadvantage of working with hyperspectral data is that there are too many bands to process. Storing such a large amount of data is also a challenge, and time complexity grows with it.

Thus, it becomes crucial either to reduce the amount of data or to select only the relevant bands, keeping in mind that classification quality should not degrade as the number of bands is reduced.
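One common way to reduce the band count, in keeping with this tag's theme, is to run PCA over the per-pixel spectra and keep only the components that explain most of the variance. The cube dimensions and data below are hypothetical; real hyperspectral bands are strongly correlated, which is simulated here with low-rank data plus noise:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical spectra: 2000 pixels x 200 bands, with only ~5
# underlying sources of variation plus a little sensor noise.
spectra = rng.normal(size=(2000, 5)) @ rng.normal(size=(5, 200))
spectra += 0.01 * rng.normal(size=spectra.shape)

# Keep enough components to explain 99% of the variance.
pca = PCA(n_components=0.99)
reduced = pca.fit_transform(spectra)

print(reduced.shape[1], "components instead of 200 bands")
```

Because the classifier then sees far fewer, nearly uncorrelated features, storage and processing time drop while most of the discriminative information is preserved.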

As you can see, Isomap is an Unsupervised Machine Learning technique aimed at Dimensionality Reduction.

It differs from a few other techniques in the same category by using a non-linear approach to dimensionality reduction instead of the linear mappings used by algorithms such as PCA. We will see how linear and non-linear approaches differ in the next section.

## How does Isometric Mapping (Isomap) work?

Isomap is a technique that combines several different algorithms, enabling it to use a non-linear way to reduce dimensions while preserving local structures.

Before we look at the example of Isomap and compare it to a linear method of Principal Components Analysis (PCA), let’s list the high-level steps that Isomap performs:

1. Use a KNN approach to find the k nearest neighbors of every data point.
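Before the comparison in the article, the whole pipeline can already be run via scikit-learn's `Isomap`, which wraps the KNN step and the rest internally. The S-curve dataset and parameter choices here are illustrative assumptions:

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap
from sklearn.decomposition import PCA

# Hypothetical example: points lying on a non-linear 3-D "S"-shaped surface.
X, _ = make_s_curve(n_samples=500, random_state=0)

# Isomap: k nearest neighbors -> geodesic distances -> low-dim embedding.
iso = Isomap(n_neighbors=10, n_components=2)
X_iso = iso.fit_transform(X)

# PCA, by contrast, can only apply a linear projection.
X_pca = PCA(n_components=2).fit_transform(X)

print(X_iso.shape, X_pca.shape)  # (500, 2) (500, 2)
```

On curved manifolds like this, Isomap's neighborhood graph lets it "unroll" the surface, while PCA's linear projection flattens it and mixes distant points together.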

As we move towards a digital world, cybersecurity is becoming a crucial part of our lives. When we talk about security in digital life, the main challenge is finding abnormal activity.

When purchasing a product online, a good number of people prefer credit cards. The credit limit sometimes lets us make purchases even when we don't have the money at that moment; on the other hand, these features are misused by cyber attackers.

To tackle this problem, we need a system that tracks the pattern of all transactions and aborts any transaction whose pattern looks abnormal.
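One common way to build such a pattern-based check (a sketch, not the article's actual system) is an unsupervised anomaly detector such as scikit-learn's Isolation Forest. The transaction features and fraud point below are entirely hypothetical:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical transaction features: amount and hour of day.
normal = np.column_stack([rng.normal(50, 10, 500),   # typical amounts
                          rng.normal(14, 3, 500)])   # daytime purchases
fraud = np.array([[900.0, 3.0]])  # one huge late-night purchase
X = np.vstack([normal, fraud])

# Isolation Forest flags points that are easy to isolate as anomalies.
clf = IsolationForest(contamination=0.01, random_state=0).fit(X)
pred = clf.predict(X)  # +1 = normal pattern, -1 = abnormal

print(pred[-1])  # the injected outlier is flagged as -1
```

In a real system, a `-1` prediction would trigger the abort-or-review step described above rather than silently approving the transaction.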