Cisco Interview Question | Dimension Reduction

Question

You are given a training data set with 1,000 rows and 1 million columns for a classification problem. You are asked to reduce the dimensionality of the data so that model computation time becomes manageable. What would you suggest?

(You are free to make practical assumptions.)

Asked by Dhruv2301

Answer ( 1 )

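  1. There are several ways to reduce the dimensions:
    1. Use an L1 (lasso) penalty. Since this is a classification problem, that means L1-regularized logistic regression rather than plain lasso regression. The coefficients of unimportant features are shrunk to exactly zero, so those columns can be dropped.
    2. Use Principal Component Analysis (PCA). With only 1,000 rows, the centered data has at most 999 components with non-zero variance, so the 1 million columns can be compressed to a few hundred components at most.
    3. t-SNE (t-distributed stochastic neighbour embedding) also reduces dimensions, but it is mainly a visualization tool for 2 or 3 dimensions and has no direct way to map new data points, so it is better suited to exploring the data than to preparing model inputs.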
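
As a rough illustration of the PCA option, here is a minimal NumPy sketch. It assumes the data fits in memory as a dense array. The function name `pca_reduce` and the synthetic data are made up for illustration, and the shapes are scaled down from 1,000 x 1,000,000 so it runs quickly. It uses a thin SVD of the centered data, so the work depends on the smaller of the two dimensions (the row count here) instead of building a columns-by-columns covariance matrix.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components.

    Illustrative helper (not from the original answer). Returns the
    reduced data, the component directions, the column means, and the
    fraction of variance explained by each kept component.
    """
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA requires centered columns

    # Thin SVD: U is (n, k), S is (k,), Vt is (k, p) with k = min(n, p).
    # Singular values come back sorted in descending order.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

    components = Vt[:n_components]            # (n_components, p)
    explained = (S ** 2) / np.sum(S ** 2)     # variance ratio per component
    Z = Xc @ components.T                     # (n, n_components)
    return Z, components, mean, explained[:n_components]


# Synthetic stand-in for the real data: 100 rows, 5,000 columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5000))

Z, components, mean, ratio = pca_reduce(X, n_components=20)
print(Z.shape)  # (100, 20)
```

New rows would be reduced with the same mean and components, i.e. `(X_new - mean) @ components.T`, so the model sees training and test data in the same reduced space. How many components to keep is a judgment call, usually guided by the cumulative explained variance.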
