threshold:
https://stackoverflow.com/questions/46224752/what-is-a-threshold-in-a-precision-recall-curve
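In general, an instance is classified as A if P(A) > threshold (0.5 is the usual default).
Each threshold value gives one precision-recall pair, computed from the True Positives, False Positives and False Negatives at that threshold (True Negatives do not appear in either formula: precision = TP / (TP + FP), recall = TP / (TP + FN)).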
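A minimal sketch of how moving the threshold changes the precision-recall pair. The helper name precision_recall and the toy probs/labels below are made up for illustration, not taken from the linked answer. Returning precision 1.0 when nothing is predicted positive is just one convention, assumed here.

```python
def precision_recall(probs, labels, threshold=0.5):
    """Return (precision, recall) when predicting positive if prob > threshold."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        predicted_positive = p > threshold
        if predicted_positive and y == 1:
            tp += 1  # true positive
        elif predicted_positive and y == 0:
            fp += 1  # false positive
        elif not predicted_positive and y == 1:
            fn += 1  # false negative
        # true negatives are not needed for precision or recall
    # Assumed convention: precision is 1.0 if there are no positive predictions.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical toy data: predicted P(A) and the true label (1 = A, 0 = not A).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for t in (0.2, 0.5, 0.7):
    p, r = precision_recall(probs, labels, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, raising the threshold trades recall for precision; sweeping it over all values traces out the precision-recall curve.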
dev example
clean up incorrect data
Deep Learning is Robust to Massive Label Noise: https://arxiv.org/abs/1705.10694
deep neural networks are capable of generalizing from training data for which true labels are massively
outnumbered by incorrect labels.
dev set error