News Scraping & Text Analysis
If you're a Stanford GSB researcher who needs access to a large volume of news articles for text analysis or another large-scale quantitative project, please contact us so we can explore your options.
We've listed some sources Stanford researchers have access to below, but this list is not exhaustive.
- TDM Studio
- From ProQuest, TDM Studio is a text and data mining solution for research, teaching, and learning.
- Includes access to current and historical newspapers. Popular titles include the WSJ and NYT, but many others are also available. Coverage of national papers, both in the US and around the world, is slightly better than coverage of local news.
- Based on Access World News, which tends to have good local newspaper coverage.
- There will be a fee for each research project; GSB members can contact us for more information.
Below are a few examples of popular newspapers and magazines that are available through either the GSB Library or Stanford Libraries for text and data mining projects. Most of them are delivered as a mix of XML and TIFF (.tif) files. If you're a GSB researcher who would like to use one of these collections, or who is looking for a different publication, please Ask Us.
- Washington Post - content from 1977 that can be used for text analysis projects.
- Economist Archive - 1843-2015
- The Times Digital Archive (London Times) - 1785-2014 (data is limited for 1979 due to strikes)
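Since most of these collections include XML files, a first pass at text analysis often starts by pulling the plain text out of each article. The following is a minimal sketch in Python, not a description of any vendor's actual format: the `<article>`, `<title>`, `<body>`, and `<p>` tags in the sample are hypothetical, and the helper names `article_text` and `word_counts` are our own. Because it collects text from every element, it does not depend on knowing the schema in advance.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def article_text(xml_string: str) -> str:
    """Return all text content of an XML document, ignoring tag names."""
    root = ET.fromstring(xml_string)
    # itertext() walks every element, so no knowledge of the schema is needed.
    return " ".join(chunk.strip() for chunk in root.itertext() if chunk.strip())


def word_counts(text: str) -> Counter:
    """Count lowercase word tokens in a string."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Hypothetical article; real files will use a vendor-specific schema.
sample = """<article>
  <title>Strike halts printing</title>
  <body><p>The strike began on Monday.</p><p>Printing resumed later.</p></body>
</article>"""

text = article_text(sample)
counts = word_counts(text)
print(counts.most_common(3))
```

For a real project you would read each file from disk and adapt the extraction to the publication's schema, for example to separate headlines from body text. The TIFF files are page images, so they would need a separate step such as OCR before any text analysis.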