About

The Data

Information was scraped from this page on 2023-06-28. More than 5,000 abstract titles from 2023 were used as training data. Embeddings for the titles were generated through the OpenAI API using the text-embedding-ada-002 model.
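
As a rough illustration of the embedding step, the sketch below builds a request for the OpenAI embeddings endpoint and reads the vectors out of the response. It is written in Python with the standard library only and is not the code used for this app. The function names, the batch size of 100, and the way the API key is passed in are all assumptions made for the example.

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"
MODEL = "text-embedding-ada-002"


def build_embedding_request(titles, api_key):
    """Build (but do not send) a POST request for one batch of titles."""
    payload = {"model": MODEL, "input": list(titles)}
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(response_body):
    """Return one vector per input, ordered by the 'index' field."""
    items = sorted(json.loads(response_body)["data"], key=lambda d: d["index"])
    return [item["embedding"] for item in items]


def embed_titles(titles, api_key, batch_size=100):
    """Send titles in batches (hypothetical batch size); needs network access."""
    vectors = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        request = build_embedding_request(batch, api_key)
        with urllib.request.urlopen(request) as response:
            vectors.extend(parse_embeddings(response.read()))
    return vectors
```

Calling embed_titles(list_of_titles, api_key) would return one list of floats per title. Each vector then serves as the predictor row for that title in the model.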

The Model

A penalized logistic regression model was fit using the glmnet R package. The tuning parameter was selected by cross-validation. The area under the ROC curve (AUC) was 0.83 on the training data.
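
The general recipe (fit a penalized logistic regression, choose the penalty strength by cross-validation, then report the AUC) can be sketched as follows. This is a pure-Python toy, not the glmnet fit described above. It assumes a ridge (L2) penalty fitted by gradient descent, whereas glmnet fits elastic-net penalties by coordinate descent, and the page does not say which penalty was used. The data are synthetic stand-ins for the embeddings, and the lambda grid, fold count, and selection by held-out log loss are likewise assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, rows):
    return [sigmoid(bias + sum(w * x for w, x in zip(weights, row))) for row in rows]


def fit_ridge_logistic(rows, labels, lam, steps=200, lr=0.5):
    """Minimise mean log loss + (lam / 2) * ||w||^2 by gradient descent."""
    n, p = len(rows), len(rows[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(steps):
        errors = [pr - y for pr, y in zip(predict(weights, bias, rows), labels)]
        grad_w = [
            sum(e * row[j] for e, row in zip(errors, rows)) / n + lam * weights[j]
            for j in range(p)
        ]
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * sum(errors) / n  # the intercept is not penalised
    return weights, bias


def log_loss(probs, labels, eps=1e-12):
    total = sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, labels)
    )
    return -total / len(labels)


def auc(probs, labels):
    """Chance that a random positive outranks a random negative (ties count 1/2)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(rows, labels, lambdas, k=5):
    """Return the lambda with the lowest mean held-out log loss, plus all scores."""
    folds = [list(range(i, len(rows), k)) for i in range(k)]
    scores = {}
    for lam in lambdas:
        losses = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(rows)) if i not in held]
            w, b = fit_ridge_logistic(
                [rows[i] for i in train], [labels[i] for i in train], lam
            )
            probs = predict(w, b, [rows[i] for i in held_out])
            losses.append(log_loss(probs, [labels[i] for i in held_out]))
        scores[lam] = sum(losses) / k
    return min(scores, key=scores.get), scores


# Synthetic stand-in data: 120 rows, 3 features, labels driven by the first two.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(120)]
labels = [int(rng.random() < sigmoid(2 * r[0] - r[1])) for r in rows]

lambdas = [0.001, 0.01, 0.1, 1.0]
best_lam, cv_scores = cross_validate(rows, labels, lambdas)
weights, bias = fit_ridge_logistic(rows, labels, best_lam)
train_auc = auc(predict(weights, bias, rows), labels)
print("selected lambda:", best_lam, "training AUC:", round(train_auc, 3))
```

The AUC in this sketch is computed on the same rows used for fitting, matching the "training data" wording above. An AUC measured on held-out data would typically be somewhat lower.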
