Knowledge Base

News Scraping & Text Analysis

Some vendors prohibit you from feeding data/text from library resources into an AI program (e.g. ChatGPT). If you want to do this, Ask Us so we can check the license terms.

If you're a Stanford GSB researcher who needs access to lots of news articles for a text analysis or other large-scale quantitative project, please contact us so we can explore your options.

We've listed some sources Stanford researchers have access to below, but this list is not exhaustive.

These resources are generally meant for faculty or PhD research. If you're an undergraduate or master's student, please contact your primary campus library with any questions.

Subscriptions/Platforms

  • TDM Studio
    • From ProQuest, TDM Studio is a text and data mining solution for research, teaching, and learning.
    • Includes access to current and historical newspapers. Popular titles include the WSJ and NYT, but many others are also available. Coverage of national papers, both in the US and around the world, is slightly better than coverage of local news.
  • Newsbank
    • Based on Access World News, which tends to have good local newspaper coverage.
    • There will be a fee for each research project; GSB members can contact us for more information.
  • LexisNexis API
    • Provided by Stanford Libraries; gives access to the News and Legal (US Federal and State) content within NexisUni.
  • PeakMetrics Data
    • For a per-project fee, Stanford researchers can work with PeakMetrics to receive a custom dataset covering freely available (e.g. non-paywalled) online news and blog articles from 2019 to 2024. The data is provided at the article or post level and includes the text of the mention, the authors and source, and sentiment measures of subjectivity and polarity.
    • GSB researchers should Ask Us; other Stanford researchers should email Regina Roberts.
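
As a rough illustration of working with article-level data like the PeakMetrics dataset described above, here is a minimal Python sketch that averages sentiment polarity by source. The field names, the value ranges, and the sample records are all assumptions made for illustration; the actual dataset's schema may differ, so check the documentation delivered with your data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class ArticleRecord:
    # Hypothetical field names; the real dataset's schema may differ.
    text: str
    authors: List[str]
    source: str
    polarity: float      # assumed scale: -1.0 (negative) to 1.0 (positive)
    subjectivity: float  # assumed scale: 0.0 (objective) to 1.0 (subjective)


def mean_polarity_by_source(records: List[ArticleRecord]) -> Dict[str, float]:
    """Group article-level records by source and average their polarity."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.source].append(record.polarity)
    return {source: mean(values) for source, values in grouped.items()}


# Made-up records for illustration only; this is not real data.
sample = [
    ArticleRecord("Example text A", ["Author One"], "example-news.test", 0.5, 0.25),
    ArticleRecord("Example text B", ["Author Two"], "example-news.test", -0.5, 0.75),
    ArticleRecord("Example text C", [], "example-blog.test", 0.25, 0.5),
]

print(mean_polarity_by_source(sample))
```

The same grouping approach would work for subjectivity or for grouping by author instead of source.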

Individual Publications

Below are a few examples of popular newspapers and magazines that are available through either the GSB Library or Stanford University Libraries for text and data mining projects. Most are provided as a mix of XML and TIFF files. If you're a GSB researcher who would like to use one of these publications or is looking for another, please Ask Us.





Answered By: Alice Kalinowski
Feb 18, 2025

Related Library Tips

    Resource Use

    Most resources are only available to current Stanford students, faculty, and staff.

    Researchers are responsible for using these resources appropriately. See the eResources Usage policy.

    Accessibility Support

    Ask Us for accessibility support with library resources.