This page is organized (for the moment) as a set of questions and answers.
What is a ROC curve?
Wikipedia, as usual, offers very good definitions.
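As a rough illustration of the standard definition (this is only a minimal sketch with made-up scores and labels, not taken from any of the sources cited on this page), a ROC curve can be traced by sweeping a decision threshold over classifier scores and recording the false positive rate and true positive rate at each step. The function names roc_points and auc are hypothetical names chosen for this example.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold over the scores.

    scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all examples tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Made-up toy data: three positives and three negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

For this toy data the area is 8/9, which matches the fraction of (positive, negative) pairs in which the positive example gets the higher score.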
Why am I interested in ROC curves?
Because they are deeply rooted in the history of psychological research, especially psychophysics (more specifically, Signal Detection Theory). But more importantly, because they are a truly multidisciplinary methodology.
In the 1970s they were used to assess diagnostic imaging systems, and they became a standard in that truly multidisciplinary field.
I applied them in my Ph.D. thesis to a marketing research problem.
Which research areas dealing with ROC curves am I interested in?
I am interested in the relationship between precision-recall (PR) curves and ROC curves. The key paper on this issue is Davis and Goadrich (2006). Find the paper on their site:
http://pages.cs.wisc.edu/~richm/articles/davisgoadrichcamera2.pdf
This has been a very influential paper, as you can see from its citations on Google Scholar. The authors even provide a Java program to compute ROC curves from PR curves and vice versa:
http://pages.cs.wisc.edu/~richm/programs/AUC/
Definitely a most interesting site to visit: http://pages.cs.wisc.edu/~richm/
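To make the relationship between the two kinds of curve concrete, here is a minimal sketch of how a single ROC point maps to a PR point and back once the numbers of positive and negative examples are fixed. This is not the Java program mentioned above; the function names roc_to_pr and pr_to_roc and the example counts are assumptions made for illustration only.

```python
def roc_to_pr(fpr, tpr, n_pos, n_neg):
    """Map a ROC point (FPR, TPR) to a PR point (recall, precision)."""
    tp = tpr * n_pos   # expected number of true positives
    fp = fpr * n_neg   # expected number of false positives
    if tp + fp == 0:
        # Nothing is predicted positive, so precision is undefined.
        return tpr, None
    return tpr, tp / (tp + fp)


def pr_to_roc(recall, precision, n_pos, n_neg):
    """Map a PR point (recall, precision) back to a ROC point (FPR, TPR)."""
    if precision == 0:
        raise ValueError("FPR cannot be recovered when precision is 0")
    tp = recall * n_pos
    fp = tp * (1 - precision) / precision
    return fp / n_neg, recall


# The same ROC point under two hypothetical class distributions.
balanced = roc_to_pr(0.1, 0.8, n_pos=100, n_neg=100)
imbalanced = roc_to_pr(0.1, 0.8, n_pos=100, n_neg=10000)
print(balanced)
print(imbalanced)
print(pr_to_roc(*balanced, n_pos=100, n_neg=100))
```

With balanced classes the precision at this point is 80/90; with 100 positives against 10000 negatives it drops to 80/1080, although the ROC point is unchanged. That is one reason PR curves are often preferred when positive examples are rare.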
Among the papers citing Davis and Goadrich on Google Scholar, you will also find the following:
J. Burez, D. Van den Poel (2008): Handling class imbalance in customer churn prediction. Expert Systems with Applications, Elsevier.
This is a most interesting paper on the problem of predicting churn; it uses PR and ROC curves to evaluate different algorithms. I am reading it now.
AL Garcia-Almanza, EPK Tsang, E Galvan-Lopez ('): Evolving Decision Rules to Discover Patterns in Financial Data Sets
Well, what to say about patterns in financial data sets these days? A curious application…
Data mining with rarity (very few positive examples compared with the total number of examples) is still a challenge. Some key references:
Weiss, G. M. (2004): Mining with rarity: a unifying framework. ACM SIGKDD Explorations Newsletter, Volume 6, Issue 1, Page 7.
S. Visa, A. Ralescu (2005): Issues in mining imbalanced data sets - A review paper. Proceedings of the Sixteen Midwest Artificial Intelligence. Journal of Artificial Intelligence Research 19:315-354.
Every single day I am more and more surprised by what the internet offers a scholar. This is an essential book on information retrieval, published in 2008 by Cambridge University Press, and available here:
http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html





