TEG (@TEG@mastodon.online), edited

Some methods for determining the number of components in PCA or the number of clusters in k-means clustering: https://thomasgladwin.substack.com/p/finding-the-true-number-of-components/. These at least work in the limit of ideal simulated data.

The basic rationale is to use random split-half data to identify what's "true" versus sampling error. Scores are based on similarities between eigenvectors or cluster centres, rather than, e.g., the shape of the eigenvalue plot.
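
To make the split-half idea concrete, here is a minimal Python sketch for the PCA case. It is not the code from the linked post, just one way the rationale could be implemented. The function names (`simulate_data`, `split_half_similarity`), the use of same-rank eigenvector matching, and the 0.9 cut-off in the usage note are all assumptions made for illustration.

```python
import numpy as np


def simulate_data(n_samples, n_features, scales, noise_sd, seed):
    """Hypothetical helper: ideal data with len(scales) true components."""
    rng = np.random.default_rng(seed)
    # Random orthonormal loading vectors, one column per true component.
    q, _ = np.linalg.qr(rng.standard_normal((n_features, len(scales))))
    latent = rng.standard_normal((n_samples, len(scales))) * np.asarray(scales)
    noise = noise_sd * rng.standard_normal((n_samples, n_features))
    return latent @ q.T + noise


def split_half_similarity(X, n_splits=50, seed=0):
    """Mean absolute cosine similarity between same-rank eigenvectors
    of the covariance matrices of two random halves of the data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    sims = np.zeros((n_splits, p))
    for s in range(n_splits):
        perm = rng.permutation(n)
        halves = (X[perm[: n // 2]], X[perm[n // 2:]])
        vecs = []
        for half in halves:
            cov = np.cov(half, rowvar=False)
            _, v = np.linalg.eigh(cov)   # eigenvalues in ascending order
            vecs.append(v[:, ::-1])      # reorder: largest eigenvalue first
        # Eigenvectors have unit length, so the dot product is the cosine;
        # the absolute value removes the arbitrary sign of each eigenvector.
        sims[s] = np.abs(np.sum(vecs[0] * vecs[1], axis=0))
    return sims.mean(axis=0)
```

With data from `simulate_data(2000, 10, (5.0, 3.0, 2.0), 0.5, seed=1)`, counting the entries of `split_half_similarity(X)` above a cut-off such as 0.9 gives an estimate of the number of components. Matching eigenvectors by rank assumes the true eigenvalues are well separated. With close eigenvalues the order can swap between halves, so a best-match pairing would be more robust.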
