"주성분 분석"의 두 판 사이의 차이
Latest revision as of 03:01, 22 December 2020
introduction
- The principal components of a matrix are linear transformations of the original columns into uncorrelated columns, arranged in order of decreasing variance (a short numerical sketch follows).
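As a small illustration (not part of the original page), here is a minimal sketch of that definition using only NumPy and synthetic data: center the columns, eigendecompose the sample covariance matrix, and project. The variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))  # columns are correlated

Xc = X - X.mean(axis=0)                # center each column
C = np.cov(Xc, rowvar=False)           # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]      # reorder by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                  # the principal components
# Their covariance is (numerically) diagonal with the eigenvalues on it,
# i.e. the new columns are uncorrelated and sorted by decreasing variance.
print(np.round(np.cov(scores, rowvar=False), 6))
```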
memo
- https://math.stackexchange.com/questions/3869/what-is-the-intuitive-relationship-between-svd-and-pca
- https://mathematica.stackexchange.com/questions/50987/principal-components-how-to-obtain-linear-transformations
- https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
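The first and third links above discuss the relationship between SVD and PCA. A minimal sketch of that relationship, assuming only NumPy and synthetic data: for a centered data matrix with \(n\) rows, the covariance eigenvalues equal \(s_i^2/(n-1)\), where \(s_i\) are the singular values, and the right singular vectors are the principal axes (up to sign).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix
eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]  # decreasing order

# Route 2: SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(eigvals, s**2 / (n - 1)))  # True: identical spectrum
# The rows of Vt are the principal axes (up to sign), and U * s gives the
# principal component scores, so the SVD performs PCA without ever forming
# the covariance matrix explicitly.
```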
computational resource
- https://drive.google.com/file/d/0B8XXo8Tve1cxT0hBUmdPLUd1VHM/view
- https://jakevdp.github.io/PythonDataScienceHandbook/05.09-principal-component-analysis.html
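The second resource above works through PCA with scikit-learn; the snippet below is a minimal usage sketch in that spirit (synthetic data, my own variable names), not code taken from the linked notebook.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))

pca = PCA(n_components=2)        # keep the two largest-variance directions
scores = pca.fit_transform(X)    # project the data onto those directions

print(pca.explained_variance_ratio_)      # fraction of variance per component
print(pca.components_.shape)              # (2, 5): one row per principal axis
X_approx = pca.inverse_transform(scores)  # rank-2 reconstruction of the data
```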
Related items

- 특이값 분해 (singular value decomposition)
Notes
- The first step in PCA is to draw a new axis representing the direction of maximum variation through the data.[1]
- This is because a significant feature is one which exhibits differences between groups, and PCA captures differences between groups.[1]
- Therefore, a PCA that uses significant features will always show some sort of grouping.[1]
- This is simply because PCA captures the variation that exists in the feature data and you have chosen all features.[1]
- Principal Component Analysis and Factor Analysis are data reduction methods to re-express multivariate data with fewer dimensions.[2]
- PCA is closely related to the Karhunen-Loève (KL) expansion.[3]
- In PCA, the eigenvectors \(\vec{\varphi}_i\) of the covariance matrix \(\Sigma\) are usually referred to as principal components or eigenmodes.[3]
- Please note that PCA is sensitive to the relative scaling of the original attributes.[4]
- In this chapter, we describe the basic idea of PCA and demonstrate how to compute and visualize PCA using R software.[5]
- Basics Understanding the details of PCA requires knowledge of linear algebra.[5]
- PCA assumes that the directions with the largest variances are the most “important” (i.e., the most principal).[5]
- Note that, the PCA method is particularly useful when the variables within the data set are highly correlated.[5]
- XLSTAT provides a complete and flexible PCA feature to explore your data directly in Excel.[6]
- PCA dimensions are also called axes or Factors.[6]
- PCA can thus be considered a Data Mining method, as it makes it easy to extract information from large datasets.[6]
- XLSTAT lets you add variables (qualitative or quantitative) or observations to the PCA after it has been computed.[6]
- The first edition of this book was the first comprehensive text written solely on principal component analysis.[7]
- In order to achieve this, principal component analysis (PCA) was conducted on joint moment waveform data from the hip, knee and ankle.[8]
- PCA was also performed comparing all data from each individual across CMJnas and CMJas conditions.[8]
- PCA was used in this study to extract common patterns of moment production during the vertical jump under two task constraints.[8]
- In biomechanics, PCA has sometimes been used to compare time-normalized waveforms.[8]
- Hence, PCA allows us to find the direction along which our data varies the most.[9]
- Applying PCA to an N-dimensional data set yields N N-dimensional eigenvectors, N eigenvalues, and one N-dimensional center point.[9]
- A simple example is provided by comparing the singular spectrum from a singular value decomposition (SVD) with that of a traditional PCA.[10]
- Note the robustness of PCA.[10]
- Components are then grouped into subspaces preserving the order determined by the maximum variance property of PCA.[10]
- \(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N\) represent the eigenvalues from a PCA of the data.[10]
- Principal Component Analysis is an appropriate tool for removing collinearity.[11]
- Right-click on the tab of PCA Plot Data1 and select Duplicate.[11]
- The new sheet is named as PCA Plot Data2.[11]
- Because of the versatility and interpretability of PCA, it has been shown to be effective in a wide variety of contexts and disciplines.[12]
- PCA's main weakness is that it tends to be highly affected by outliers in the data.[12]
- In the following sections, we will look at other unsupervised learning methods that build on some of the ideas of PCA.[12]
- Find the principal components for one data set and apply the PCA to another data set.[13]
- For example, you can preprocess the training data set by using PCA and then train a model.[13]
- Use coeff (principal component coefficients) and mu (estimated means of XTrain ) to apply the PCA to a test data set.[13]
- To use the trained model for the test set, you need to transform the test data set by using the PCA obtained from the training data set.[13]
- The estimated noise covariance following the Probabilistic PCA model from Tipping and Bishop 1999.[14]
- Implements the probabilistic PCA model from: Tipping, M. E., and Bishop, C. M. (1999).[14]
- Some of the important differences and similarities between PCA and MLPCA are summarized in Table 2 and are briefly discussed here.[15]
- One of the most convenient features of PCA that is lost in the transition to MLPCA is the simultaneous estimation of all subspace models.[15]
- Of course, some properties of PCA remain the same for MLPCA.[15]
- In addition, the columns of U and V remain orthonormal for both PCA and MLPCA.[15]
- The PCA score plot of the first two PCs of a data set about food consumption profiles.[16]
- Principal Component Analysis is a dimension-reduction tool that can be used advantageously in such situations.[17]
- The main idea behind principal component analysis is to derive a linear function \({\bf y}\) for each of the vector variables \({\bf z}_i\).[17]
- But if we want to tease out variation, PCA finds a new coordinate system in which every point has a new (x,y) value.[18]
- PCA is useful for eliminating dimensions.[18]
- 3D example: with three dimensions, PCA is more useful, because it's hard to see through a cloud of data.[18]
- To see the "official" PCA transformation, click the "Show PCA" button.[18]
- In this section we will start by visualizing the data as well as consider a simplified, geometric view of what a PCA model looks like.[19]
- The PCA method starts with the "Road" class and computes the mean value for each attribute for that class.[20]
- The PCA method computes class scores based on the training samples you select.[20]
- Intensive Principal Component Analysis (InPCA): classical PCA takes a set of data examples and infers features which are linearly uncorrelated.[21]
- The features to be analyzed with PCA are compared via their Euclidean distance.[21]
- This arises because both InPCA and PCA/MDS rely on mean shifting the input data before finding an eigenbasis.[21]
- Thus, we view InPCA as a natural generalization of PCA to probability distributions and MDS to non-Euclidean embeddings.[21]
- As an added benefit, the “new” variables after PCA are all independent of one another.[22]
- If you answered “yes” to all three questions, then PCA is a good method to use.[22]
- Our original data transformed by PCA.[22]
- Here, I walk through an algorithm for conducting PCA.[22]
- PCA is used in exploratory data analysis and for making predictive models.[23]
- PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis.[23]
- PCA can be thought of as fitting a p-dimensional ellipsoid to the data, where each axis of the ellipsoid represents a principal component.[23]
- PCA essentially rotates the set of points around their mean in order to align with the principal components.[23]
- This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do.[24]
- Many techniques have been developed for this purpose, but principal component analysis (PCA) is one of the oldest and most widely used.[24]
- PCA can be based on either the covariance matrix or the correlation matrix (see the sketch after this list).[24]
- Section 3c discusses one of the extensions of PCA that has been most active in recent years, namely robust PCA (RPCA).[24]
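To make the covariance-versus-correlation point concrete (and, with it, the earlier note that PCA is sensitive to the relative scaling of the attributes): covariance-based PCA is dominated by whichever variables happen to have large units, while correlation-based PCA, which is equivalent to covariance-based PCA on standardized columns, weights all variables equally. A minimal sketch, assuming only NumPy and synthetic data of my own construction:

```python
import numpy as np

rng = np.random.default_rng(3)
# Three positively correlated variables measured on very different scales
latent = rng.normal(size=(300, 1))
noise = rng.normal(size=(300, 3))
X = (latent + noise) * np.array([1.0, 10.0, 100.0])

# Covariance-based PCA: the leading axis just follows the largest scale.
w_cov, v_cov = np.linalg.eigh(np.cov(X, rowvar=False))
print(np.round(v_cov[:, -1], 3))   # ~ ±[0, 0, 1]: driven by the units alone

# Correlation-based PCA = covariance-based PCA on standardized columns,
# so every variable enters with equal weight.
w_corr, v_corr = np.linalg.eigh(np.corrcoef(X, rowvar=False))
print(np.round(v_corr[:, -1], 3))  # ~ ±[0.577, 0.577, 0.577]: the shared factor
```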
Sources
- [1] What does Principal Component Analysis (PCA) show? (https://www.nonlinear.com/support/progenesis/lc-ms/faq/v4.1/pca.aspx)
- [2] Principal Component Analysis (https://sites.google.com/site/econometricsacademy/econometrics-models/principal-component-analysis)
- [3] Principal Component Analysis (https://www.futurelearn.com/info/courses/statistical-shape-modelling/0/steps/16876)
- [4] Principal Component Analysis (https://docs.rapidminer.com/latest/studio/operators/cleansing/dimensionality_reduction/principal_component_analysis.html)
- [5] Principal Component Analysis Essentials (http://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/112-pca-principal-component-analysis-essentials/)
- [6] Principal Component Analysis (PCA) (https://www.xlstat.com/en/solutions/features/principal-component-analysis-pca)
- [7] Principal Component Analysis (https://link.springer.com/book/10.1007/b98835)
- [8] Principal Component Analysis Reveals the Proximal to Distal Pattern in Vertical Jumping Is Governed by Two Functional Degrees of Freedom (https://www.frontiersin.org/articles/10.3389/fbioe.2019.00193/full)
- [9] OpenCV: Introduction to Principal Component Analysis (PCA) (https://docs.opencv.org/master/d1/dee/tutorial_introduction_to_pca.html)
- [10] Component retention in principal component analysis with application to cDNA microarray data (https://biologydirect.biomedcentral.com/articles/10.1186/1745-6150-2-2)
- [11] Principal Component Analysis (https://www.originlab.com/doc/Tutorials/Principal-Component-Analysis)
- [12] In Depth: Principal Component Analysis (https://jakevdp.github.io/PythonDataScienceHandbook/05.09-principal-component-analysis.html)
- [13] Principal component analysis of raw data (https://www.mathworks.com/help/stats/pca.html)
- [14] sklearn.decomposition.PCA — scikit-learn 0.23.2 documentation (http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
- [15] Principal Component Analysis - an overview (https://www.sciencedirect.com/topics/medicine-and-dentistry/principal-component-analysis)
- [16] What is principal component analysis (PCA) and how it is used? (https://blog.umetrics.com/what-is-principal-component-analysis-pca-and-how-it-is-used)
- [17] 6.5.5. Principal Components (https://www.itl.nist.gov/div898/handbook/pmc/section5/pmc55.htm)
- [18] Principal Component Analysis explained visually (https://setosa.io/ev/principal-component-analysis/)
- [19] 6.5. Principal Component Analysis (PCA) — Process Improvement using Data (https://learnche.org/pid/latent-variable-modelling/principal-component-analysis/index)
- [20] Principal Components Analysis Background (http://www.harrisgeospatial.com/docs/BackgroundPCA.html)
- [21] Visualizing probabilistic models and data with Intensive Principal Component Analysis (https://www.pnas.org/content/116/28/13762)
- [22] A One-Stop Shop for Principal Component Analysis (https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c)
- [23] Principal component analysis (https://en.wikipedia.org/wiki/Principal_component_analysis)
- [24] Principal component analysis: a review and recent developments (https://royalsocietypublishing.org/doi/10.1098/rsta.2015.0202)