I have a data set of around 3000 rows/records. Each record contains a movie synopsis, its film type (Foreign or Local), and its rating or classification.
The data mining task goes like this:
1. Using the available data set, identify which classification algorithm classifies new movies most accurately. The three algorithms to compare are J48 Decision Tree, Naive Bayes, and Neural Networks.
2. Based on the results of step 1, select the best-performing algorithm and build the model with it.
3. The model will be integrated into a web, desktop, or mobile application that classifies new data. The input will be the synopsis/plot of the new movie and its film type (either Foreign or Local). The output will be the probability of the movie belonging to each class. This data set has 7 classes.
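
To make the expected input and output of step 3 concrete, here is a minimal Python sketch of one of the three candidate algorithms (Naive Bayes), written with the standard library only. It is an illustration, not the required implementation. The names tokenize, NaiveBayes, accuracy, and toy_rows are hypothetical, the labels class_a and class_b are placeholders for the 7 real classes, and the toy rows are invented for the example and are not taken from the attached data set. Treating the film type as one extra token is also an assumption about how to combine the two inputs.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(synopsis, film_type):
    """Lowercase word tokens plus one marker token for the film type."""
    words = re.findall(r"[a-z']+", synopsis.lower())
    return words + ["__type_" + film_type.lower()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        # rows: iterable of (synopsis, film_type, label) tuples
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for synopsis, film_type, label in rows:
            tokens = tokenize(synopsis, film_type)
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_proba(self, synopsis, film_type):
        """Return {label: probability}; the probabilities sum to 1."""
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(synopsis, film_type) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((self.token_counts[label][t] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long synopses.
        top = max(log_scores.values())
        exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp_scores.values())
        return {k: v / z for k, v in exp_scores.items()}


def accuracy(model, rows):
    """Fraction of rows whose most probable class matches the true label."""
    correct = 0
    for synopsis, film_type, label in rows:
        probs = model.predict_proba(synopsis, film_type)
        correct += max(probs, key=probs.get) == label
    return correct / len(rows)


# Invented toy rows with placeholder labels (not the real data set).
toy_rows = [
    ("A soldier fights an army in a brutal war", "Foreign", "class_a"),
    ("Two rival gangs fight a war over the city", "Local", "class_a"),
    ("A young couple fall in love one summer", "Local", "class_b"),
    ("A widow finds love again in a small town", "Foreign", "class_b"),
]

model = NaiveBayes().fit(toy_rows)
probs = model.predict_proba("A soldier goes to war", "Foreign")
for label, p in sorted(probs.items()):
    print(f"{label}: {p:.3f}")
```

For step 1, the same accuracy helper could be applied to each of the three candidate models on held-out records (for example with k-fold cross-validation) so that the comparison is made on data the models have not seen. Scoring on the training rows, as the toy example allows, would overstate accuracy.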
You can use WEKA or Python.
A sample data set is attached.
Let me know if you need further details.