@AmanSikarwar AmanSikarwar commented Oct 27, 2023

Team Members:

  • Roll no. B22150 : Aman Sikarwar
  • Roll no. B22003 : Arnav Thakur
  • Roll no. B22005 : Aryan Kumar

Description

Preprocessing

  • Why KNN for filling in NaN values?

We first replaced all missing values with the respective column mean and plotted histograms of the result.
Because there were so many NaN values, each histogram showed a large spike at the mean.
To avoid this, we repeated the imputation with KNN (K=5). The resulting histograms were fairly smooth, so we decided to go with KNN.
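The difference between the two imputation strategies can be illustrated with a small standard-library sketch. This is not the project's code: the toy data, the function names `mean_impute` and `knn_impute`, and the distance measure are assumptions made for illustration, and K is set to 2 only because the toy table is tiny. In practice a library imputer such as scikit-learn's `KNNImputer` would likely be used instead.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def _distance(a, b):
    """Root-mean-square difference over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=5):
    """Replace each None with the mean of that column over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows where column j is observed.
            candidates = [(_distance(row, other), other[j])
                          for m, other in enumerate(rows)
                          if m != i and other[j] is not None]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical toy table: the second column is roughly twice the first.
rows = [
    [1.0, 2.0], [2.0, 4.0], [3.0, 6.0],
    [8.0, 16.0], [9.0, 18.0], [10.0, 20.0],
    [1.5, None], [9.5, None],
]

mean_filled = mean_impute(rows)
knn_filled = knn_impute(rows, k=2)

print([r[1] for r in mean_filled[6:]])  # both gaps get the same value
print([r[1] for r in knn_filled[6:]])   # each gap follows its neighbours
```

With mean imputation every gap in a column receives the same value, which is what produces the spike in the histogram. With KNN each gap is filled from rows that look similar, so the filled values spread across the existing distribution.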

Regression

Categorical features were excluded from linear regression.
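As a rough sketch of what excluding categorical features can look like, the snippet below keeps only numeric columns and fits a one-feature least-squares line. The records, the column names (`area`, `city`, `price`), and the helpers `numeric_columns` and `fit_line` are hypothetical and chosen only for illustration; they are not taken from the project.

```python
def numeric_columns(records):
    """Return the names of columns whose values are all int or float."""
    names = list(records[0].keys())
    return [
        n for n in names
        if all(isinstance(r[n], (int, float)) and not isinstance(r[n], bool)
               for r in records)
    ]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical records: "city" is categorical and is left out of the fit.
records = [
    {"area": 50.0, "city": "A", "price": 110.0},
    {"area": 60.0, "city": "B", "price": 130.0},
    {"area": 80.0, "city": "A", "price": 170.0},
    {"area": 100.0, "city": "C", "price": 210.0},
]

usable = numeric_columns(records)
slope, intercept = fit_line([r["area"] for r in records],
                            [r["price"] for r in records])
print(usable, slope, intercept)
```

Dropping categorical columns is the simplest option. Encoding them (for example one-hot) is an alternative that the description does not mention.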

Classification

We tried several algorithms, starting with a Bayes classifier (accuracy: 64%).
Next we tried random forest
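For reference, accuracy figures like the one above are the fraction of predictions that match the true labels. The sketch below uses made-up labels purely to show the computation; the helper name `accuracy` and the data are assumptions, not the project's results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the corresponding true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 16 of 25 predictions are correct.
y_true = [1] * 25
y_pred = [1] * 16 + [0] * 9

score = accuracy(y_true, y_pred)
print(f"{score:.0%}")
```

Comparing classifiers on the same held-out split with the same metric keeps the numbers comparable across algorithms.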
