"주성분 분석"의 두 판 사이의 차이

수학노트
둘러보기로 가기 검색하러 가기
imported>Pythagoras0
 
(사용자 2명의 중간 판 8개는 보이지 않습니다)
1번째 줄: 1번째 줄:
==introduction==
* The principal components of a matrix are linear transformations of its original columns into uncorrelated columns, arranged in order of decreasing variance.
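The statement above can be made precise with the standard linear-algebra formulation (the symbols \(X\), \(S\), \(w_k\), \(\lambda_k\), \(t_k\) are introduced here for illustration and are not taken from a cited source). For a mean-centered \(n \times p\) data matrix \(X\),

\[ S = \frac{1}{n-1} X^{\mathsf T} X, \qquad S w_k = \lambda_k w_k, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0, \]

with orthonormal eigenvectors \(w_1, \dots, w_p\). The \(k\)-th principal component is the column \(t_k = X w_k\); the columns \(t_1, \dots, t_p\) are pairwise uncorrelated with \(\operatorname{Var}(t_k) = \lambda_k\), which is exactly the "uncorrelated columns arranged in order of decreasing variance" described above.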
==memo==
* https://math.stackexchange.com/questions/3869/what-is-the-intuitive-relationship-between-svd-and-pca
* https://mathematica.stackexchange.com/questions/50987/principal-components-how-to-obtain-linear-transformations
* https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
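The first memo link above asks how the SVD relates to PCA. Here is a minimal numpy sketch of that relationship (illustrative only: the random data, sizes, and seed are arbitrary choices, not taken from the linked threads). For mean-centered data, the right singular vectors are the principal axes, and the squared singular values divided by \(n-1\) are the eigenvalues of the covariance matrix:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # arbitrary correlated data
Xc = X - X.mean(axis=0)                                   # center each column
n = Xc.shape[0]

# Route 1: eigendecomposition of the sample covariance matrix
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # eigh returns eigenvalues in ascending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data matrix (singular values come out descending)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Same spectrum: covariance eigenvalues equal squared singular values / (n - 1)
print(np.allclose(eigvals, s**2 / (n - 1)))        # expected: True
# Same principal axes, up to the arbitrary sign of each vector
print(np.allclose(np.abs(eigvecs), np.abs(Vt.T)))  # expected: True
</syntaxhighlight>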
==computational resource==
* https://drive.google.com/file/d/0B8XXo8Tve1cxT0hBUmdPLUd1VHM/view
* https://jakevdp.github.io/PythonDataScienceHandbook/05.09-principal-component-analysis.html
==related items==
* [[특이값 분해]] (singular value decomposition)
==notes==
* The first step in PCA is to draw a new axis representing the direction of maximum variation through the data.<ref name="ref_9dbe">[https://www.nonlinear.com/support/progenesis/lc-ms/faq/v4.1/pca.aspx What does Principal Component Analysis (PCA) show?]</ref>
* This is because a significant feature is one which exhibits differences between groups, and PCA captures differences between groups.<ref name="ref_9dbe" />
* Therefore, using significant features for the PCA will always produce some sort of grouping.<ref name="ref_9dbe" />
* This is simply because PCA captures the variation that exists in the feature data and you have chosen all features.<ref name="ref_9dbe" />
* Principal Component Analysis and Factor Analysis are data reduction methods to re-express multivariate data with fewer dimensions.<ref name="ref_42b5">[https://sites.google.com/site/econometricsacademy/econometrics-models/principal-component-analysis Principal Component Analysis]</ref>
* PCA is closely related to the Karhunen-Loève (KL) expansion.<ref name="ref_9b0b">[https://www.futurelearn.com/info/courses/statistical-shape-modelling/0/steps/16876 Principal Component Analysis]</ref>
* In PCA, the eigenvectors \(\vec{\varphi}_i\) of the covariance matrix \(\Sigma\) are usually referred to as principal components or eigenmodes.<ref name="ref_9b0b" />
* Please note that PCA is sensitive to the relative scaling of the original attributes.<ref name="ref_b10f">[https://docs.rapidminer.com/latest/studio/operators/cleansing/dimensionality_reduction/principal_component_analysis.html Principal Component Analysis]</ref>
* In this chapter, we describe the basic idea of PCA and demonstrate how to compute and visualize PCA using R software.<ref name="ref_20a5">[http://www.sthda.com/english/articles/31-principal-component-methods-in-r-practical-guide/112-pca-principal-component-analysis-essentials/ Principal Component Analysis Essentials]</ref>
* Understanding the details of PCA requires knowledge of linear algebra.<ref name="ref_20a5" />
* PCA assumes that the directions with the largest variances are the most “important” (i.e., the most principal).<ref name="ref_20a5" />
* Note that the PCA method is particularly useful when the variables within the data set are highly correlated.<ref name="ref_20a5" />
* XLSTAT provides a complete and flexible PCA feature to explore your data directly in Excel.<ref name="ref_b97e">[https://www.xlstat.com/en/solutions/features/principal-component-analysis-pca Principal Component Analysis (PCA)]</ref>
* PCA dimensions are also called axes or factors.<ref name="ref_b97e" />
* PCA can thus be considered a data mining method, as it makes it easy to extract information from large datasets.<ref name="ref_b97e" />
* XLSTAT lets you add variables (qualitative or quantitative) or observations to the PCA after it has been computed.<ref name="ref_b97e" />
* The first edition of this book was the first comprehensive text written solely on principal component analysis.<ref name="ref_c7ac">[https://link.springer.com/book/10.1007/b98835 Principal Component Analysis]</ref>
* In order to achieve this, principal component analysis (PCA) was conducted on joint moment waveform data from the hip, knee and ankle.<ref name="ref_03f4">[https://www.frontiersin.org/articles/10.3389/fbioe.2019.00193/full Principal Component Analysis Reveals the Proximal to Distal Pattern in Vertical Jumping Is Governed by Two Functional Degrees of Freedom]</ref>
* PCA was also performed comparing all data from each individual across CMJnas and CMJas conditions.<ref name="ref_03f4" />
* PCA was used in this study to extract common patterns of moment production during the vertical jump under two task constraints.<ref name="ref_03f4" />
* In biomechanics, PCA has sometimes been used to compare time-normalized waveforms.<ref name="ref_03f4" />
* Hence, PCA allows us to find the direction along which our data varies the most.<ref name="ref_0869">[https://docs.opencv.org/master/d1/dee/tutorial_introduction_to_pca.html OpenCV: Introduction to Principal Component Analysis (PCA)]</ref>
* Applying PCA to an N-dimensional data set yields N N-dimensional eigenvectors, N eigenvalues, and one N-dimensional center point.<ref name="ref_0869" />
* A simple example is provided by comparing the singular spectrum from a singular value decomposition (SVD) with that of a traditional PCA.<ref name="ref_a46f">[https://biologydirect.biomedcentral.com/articles/10.1186/1745-6150-2-2 Component retention in principal component analysis with application to cDNA microarray data]</ref>
* Note the robustness of PCA.<ref name="ref_a46f" />
* Components are then grouped into subspaces preserving the order determined by the maximum variance property of PCA.<ref name="ref_a46f" />
* \(\lambda_1, \ldots, \lambda_N\) represent the eigenvalues from a PCA of the data.<ref name="ref_a46f" />
* Principal Component Analysis is an appropriate tool for removing collinearity.<ref name="ref_be70">[https://www.originlab.com/doc/Tutorials/Principal-Component-Analysis Principal Component Analysis]</ref>
* Right-click on the tab of PCA Plot Data1 and select Duplicate.<ref name="ref_be70" />
* The new sheet is named PCA Plot Data2.<ref name="ref_be70" />
* Because of the versatility and interpretability of PCA, it has been shown to be effective in a wide variety of contexts and disciplines.<ref name="ref_4bb4">[https://jakevdp.github.io/PythonDataScienceHandbook/05.09-principal-component-analysis.html In Depth: Principal Component Analysis]</ref>
* PCA's main weakness is that it tends to be highly affected by outliers in the data.<ref name="ref_4bb4" />
* In the following sections, we will look at other unsupervised learning methods that build on some of the ideas of PCA.<ref name="ref_4bb4" />
* Find the principal components for one data set and apply the PCA to another data set.<ref name="ref_4836">[https://www.mathworks.com/help/stats/pca.html Principal component analysis of raw data]</ref>
* For example, you can preprocess the training data set by using PCA and then train a model.<ref name="ref_4836" />
* Use coeff (principal component coefficients) and mu (estimated means of XTrain) to apply the PCA to a test data set.<ref name="ref_4836" />
* To use the trained model for the test set, you need to transform the test data set by using the PCA obtained from the training data set.<ref name="ref_4836" /> (A Python sketch of this train/test workflow appears after this list.)
* The estimated noise covariance following the Probabilistic PCA model from Tipping and Bishop 1999.<ref name="ref_586a">[http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html sklearn.decomposition.PCA — scikit-learn 0.23.2 documentation]</ref>
* Implements the probabilistic PCA model from: Tipping, M. E., and Bishop, C. M. (1999).<ref name="ref_586a" />
* Some of the important differences and similarities between PCA and MLPCA are summarized in Table 2 and are briefly discussed here.<ref name="ref_e3a9">[https://www.sciencedirect.com/topics/medicine-and-dentistry/principal-component-analysis Principal Component Analysis - an overview]</ref>
* One of the most convenient features of PCA that is lost in the transition to MLPCA is the simultaneous estimation of all subspace models.<ref name="ref_e3a9" />
* Of course, some properties of PCA remain the same for MLPCA.<ref name="ref_e3a9" />
* In addition, the columns of U and V remain orthonormal for both PCA and MLPCA.<ref name="ref_e3a9" />
* The PCA score plot of the first two PCs of a data set about food consumption profiles.<ref name="ref_5f8a">[https://blog.umetrics.com/what-is-principal-component-analysis-pca-and-how-it-is-used What is principal component analysis (PCA) and how it is used?]</ref>
* Principal Component Analysis is a dimension-reduction tool that can be used advantageously in such situations.<ref name="ref_9e21">[https://www.itl.nist.gov/div898/handbook/pmc/section5/pmc55.htm 6.5.5. Principal Components]</ref>
* The main idea behind principal component analysis is to derive a linear function \({\bf y}\) for each of the vector variables \({\bf z}_i\).<ref name="ref_9e21" />
* But if we want to tease out variation, PCA finds a new coordinate system in which every point has a new (x,y) value.<ref name="ref_77c6">[https://setosa.io/ev/principal-component-analysis/ Principal Component Analysis explained visually]</ref>
* PCA is useful for eliminating dimensions.<ref name="ref_77c6" />
* With three dimensions, PCA is more useful, because it is hard to see through a cloud of data.<ref name="ref_77c6" />
* To see the "official" PCA transformation, click the "Show PCA" button.<ref name="ref_77c6" />
* In this section we will start by visualizing the data as well as consider a simplified, geometric view of what a PCA model looks like.<ref name="ref_b877">[https://learnche.org/pid/latent-variable-modelling/principal-component-analysis/index 6.5. Principal Component Analysis (PCA) — Process Improvement using Data]</ref>
* The PCA method starts with the "Road" class and computes the mean value for each attribute for that class.<ref name="ref_1007">[http://www.harrisgeospatial.com/docs/BackgroundPCA.html Principal Components Analysis Background]</ref>
* The PCA method computes class scores based on the training samples you select.<ref name="ref_1007" />
* Intensive Principal Component Analysis (InPCA): classical PCA takes a set of data examples and infers features which are linearly uncorrelated.<ref name="ref_731f">[https://www.pnas.org/content/116/28/13762 Visualizing probabilistic models and data with Intensive Principal Component Analysis]</ref>
* The features to be analyzed with PCA are compared via their Euclidean distance.<ref name="ref_731f" />
* This arises because both InPCA and PCA/MDS rely on mean shifting the input data before finding an eigenbasis.<ref name="ref_731f" />
* Thus, we view InPCA as a natural generalization of PCA to probability distributions and of MDS to non-Euclidean embeddings.<ref name="ref_731f" />
* As an added benefit, the “new” variables after PCA are all independent of one another.<ref name="ref_6c83">[https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c A One-Stop Shop for Principal Component Analysis]</ref>
* If you answered “yes” to all three questions, then PCA is a good method to use.<ref name="ref_6c83" />
* Our original data transformed by PCA.<ref name="ref_6c83" />
* Here, I walk through an algorithm for conducting PCA.<ref name="ref_6c83" />
* PCA is used in exploratory data analysis and for making predictive models.<ref name="ref_955d">[https://en.wikipedia.org/wiki/Principal_component_analysis Principal component analysis]</ref>
* PCA is the simplest of the true eigenvector-based multivariate analyses and is closely related to factor analysis.<ref name="ref_955d" />
* PCA can be thought of as fitting a p-dimensional ellipsoid to the data, where each axis of the ellipsoid represents a principal component.<ref name="ref_955d" />
* PCA essentially rotates the set of points around their mean in order to align with the principal components.<ref name="ref_955d" />
* This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do.<ref name="ref_1796">[https://royalsocietypublishing.org/doi/10.1098/rsta.2015.0202 Principal component analysis: a review and recent developments]</ref>
* Many techniques have been developed for this purpose, but principal component analysis (PCA) is one of the oldest and most widely used.<ref name="ref_1796" />
* PCA can be based on either the covariance matrix or the correlation matrix.<ref name="ref_1796" />
* Section 3c discusses one of the extensions of PCA that has been most active in recent years, namely robust PCA (RPCA).<ref name="ref_1796" />
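Several of the notes above (fitting PCA on a training set and applying it to held-out data, sensitivity to the scaling of the original attributes, covariance- versus correlation-based PCA, and scikit-learn's probabilistic noise estimate) can be illustrated with one short sketch. This is a rough Python/scikit-learn analogue of the MATLAB coeff/mu workflow, not the cited code itself; the iris data and the choice of two components are arbitrary.

<syntaxhighlight lang="python">
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_train, X_test = train_test_split(X, random_state=0)

# Standardizing first makes this a correlation-matrix PCA;
# skipping the scaler would give a covariance-matrix PCA instead.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))

# Apply the training-set PCA (its mean and loadings) to the held-out data,
# mirroring the coeff/mu workflow described in the MathWorks note above.
scores_test = pca.transform(scaler.transform(X_test))

print(pca.components_)          # principal axes (loadings)
print(pca.explained_variance_)  # variance captured by each component
print(pca.noise_variance_)      # probabilistic-PCA noise estimate (Tipping & Bishop, 1999)
print(scores_test[:3])          # first few test-set scores in the 2-component space
</syntaxhighlight>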
===sources===
<references />
 
[[분류:계산]]
[[분류:migrate]]
