The basic principle for all of these methods is the same:
1. Take a large amount of data labeled with the characteristics being tested for ("true/false", "successful/unsuccessful", "terrorist/not-terrorist", etc.).
2. Extract features from the data (sentence structures and genres, actors & directors, nationality and itinerary, etc.).
3. Use various learning algorithms (e.g. neural networks, Support Vector Machines) to figure out correlations between features and labels. (The key here is that with large amounts of data, the algorithm's decision-making process is essentially unfathomable. This is why it seems like "BS" to a lot of people: there's no human reasoning going on, it's just juggling thousands and thousands of variables in enormously complex calculations.)
4. Feed unlabeled data into the algorithm, let it run, and it will produce a prediction based on what it learned from the labeled data.
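If you want to see what those four steps look like in practice, here's a minimal sketch using scikit-learn. The toy movie-review data, word-count features, and SVM are all placeholders for illustration, not what any particular paper did:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# 1. Labeled data: each example comes with a known label.
texts  = ["great plot and acting", "dull and predictable", "loved it", "waste of time"]
labels = ["successful", "unsuccessful", "successful", "unsuccessful"]

# 2. Extract features (here: simple word counts).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# 3. Learn correlations between features and labels (here: a Support Vector Machine).
clf = LinearSVC()
clf.fit(X, labels)

# 4. Feed in unlabeled data and get a prediction based on what was learned.
new_X = vectorizer.transform(["predictable plot but great acting"])
print(clf.predict(new_X))
```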
In the interest of education, this is missing an important step: testing and validation.
You don't just want to build your classifier on your entire dataset and then measure its accuracy on that same data, because that estimate will always be biased and overconfident: the classifier has already seen every example it's being graded on.
You want to do some kind of testing and validation, by dividing your labeled dataset into two or more sets, usually called "training" and "testing", and possibly a third called "validation."
You use the "training" set to "train" your classifier. That is, the algorithm looks at all your data and their labels, and spits out a classifier you can use for prediction of new data. If your classifier were a dragon, and you were training it, this is where the montage would be.
Apply that to your "testing" set (for which you know the labels, but your algorithm doesn't) to evaluate the accuracy.
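Here's a rough sketch of that train/test idea with scikit-learn; the synthetic dataset and SVM classifier are just stand-ins:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Hold out 25% of the labeled data as the "testing" set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LinearSVC()              # the training montage happens here
clf.fit(X_train, y_train)

# The classifier never saw the test labels; compare its guesses against them.
print(accuracy_score(y_test, clf.predict(X_test)))
```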
Sometimes classifiers have parameters that the experimenter sets by hand (as opposed to the parameters the algorithm learns directly from the data), and these can affect the results. Basically, they're ways to "tweak" the algorithm.
This is where the "validation" set comes in. You do lots of iterations on the "training" and "testing" sets, tweaking the parameters until you get the best performance. Then you run your final classifier on the "validation" set (where, again, you know the labels but the algorithm doesn't) to get the final accuracy you should report, which is the least biased.
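To make that tuning loop concrete, here's a sketch that follows the naming used above (tweak parameters with "training"/"testing", report once on "validation"). The parameter grid and dataset are made up for illustration, and be aware that many libraries and papers swap the names "testing" and "validation":

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Set aside a "validation" set that is touched exactly once, at the very end.
X_rest, X_val, y_rest, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

best_C, best_acc = None, 0.0
for C in [0.01, 0.1, 1, 10]:                      # the user-set "tweak"
    clf = LinearSVC(C=C).fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    if acc > best_acc:
        best_C, best_acc = C, acc

# Final, least-biased number: refit with the chosen parameter, score on data
# that never influenced any decision along the way.
final = LinearSVC(C=best_C).fit(X_rest, y_rest)
print(accuracy_score(y_val, final.predict(X_val)))
```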
Another way to accomplish this is "cross-validation," where you split the data into multiple groups (or "folds") and iteratively hold out each group as the testing set, training on all of the remaining groups. Then you report the average accuracy.
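And here's what k-fold cross-validation looks like in scikit-learn, with 5 folds on a toy dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Each of the 5 folds takes a turn as the held-out test set; train on the rest.
scores = cross_val_score(LinearSVC(), X, y, cv=5)
print(scores.mean())   # report the average accuracy
```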
Never trust a paper about prediction that doesn't do some kind of testing/validation.
(This paper used 5-fold cross-validation. You can find their cross-validation scheme in the "data description" link under "Downloads" on the researchers' site here.)