Mining Politico newsletters for a view across the country

Dive into the data yourself: https://legislata.shinyapps.io/PoliticoWordApp/#

At the heart of politics is knowing what’s happening. That can be tough when there’s more happening than any one person can know about.

There’s only so much time in the day and, for those in government, much of that is chewed up by constituent requests, community meetings, traveling to events, and policy research. Finding the time to zoom out from the granular and take in the big picture can be impossible.

However, there are tools that can help. With techniques collectively known as text mining, we can turn a massive amount of unstructured text, like hundreds of press releases, thousands of emails, or millions of tweets, into charts and graphs that help us understand our political world a little bit better.

To illustrate this, we’ve mined two months of the daily Politico Playbooks for California, Florida, Illinois, Massachusetts, New Jersey, and New York. These newsletters offer some of the most comprehensive, high-quality reporting on politics in each of those states, and so can serve as ideal snapshots of what was happening every day of the last two months. In this particular case, we want to use text mining to tell us how state politics is the same or different across the country.

Before we begin, you can explore the data in a free interactive web app: https://legislata.shinyapps.io/PoliticoWordApp/#

Let’s dive into four things we learned from this data.

1. Things turned positive in early May.

Sentiment analysis is an approach that scores words as either positive or negative to find the average sentiment of a piece of text. “The happy child laughed” has two positive words in it (“happy” and “laughed”) so that sentence would score higher than “the sad child cried”, which has two negative words in it.
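To make the idea concrete, here is a minimal sketch of word-level sentiment scoring in Python. The four-word lexicon and the function name sentiment_score are made up for illustration; the analysis in this post would have relied on a much larger published lexicon, and this is not the exact code behind the charts.

```python
import re

# Hypothetical toy lexicon, for illustration only. A real analysis would
# use a published sentiment lexicon with thousands of scored words.
POSITIVE = {"happy", "laughed"}
NEGATIVE = {"sad", "cried"}

def sentiment_score(text):
    """Return (positive words - negative words) / total words.

    Returns 0.0 for text that contains no words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)

happy = sentiment_score("The happy child laughed")
sad = sentiment_score("The sad child cried")
print(happy > sad)  # the first sentence scores higher
```

Dividing by the total word count keeps long and short newsletters comparable, which matters when averaging scores by day or by state.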

By scoring all the words in the Playbooks, we see that there was an uptick in positive words in the first half of May. All states saw boosts there, with big spikes for New York and California in particular.

politico_daily_sentiment.png

Why does this matter? For one, analyzing the sentiment of a large corpus of text can help detect patterns that may be too subtle to notice through reading (or that sit in a body of text too big to read). A rise in positive words in these newsletters may indicate that the news became more optimistic in these weeks, perhaps reflecting a better economy or legislative progress. A negative trend could indicate gridlock or stagnation.

It may also show us which states are seeing similar environments. We might expect that New York and New Jersey are closely correlated, but the Garden State seems to be a couple of weeks behind its neighbor in getting more and then less positive about the news. If this is a consistent pattern, it may mean that economic indicators in New York lead those in New Jersey (or that other political factors are at work, which could become apparent with further study).

politico_state_sentiments.png
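One way to check whether one state trails another is to shift one series by a few weeks and see which shift lines the two up best. The sketch below does that with a plain Pearson correlation. The weekly numbers are made up for illustration (they are not the Playbook scores), and best_lag is a hypothetical helper, not something taken from the analysis above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def best_lag(leader, follower, max_lag):
    """Return (lag, r) for the lag at which follower best tracks leader.

    A lag of k compares leader[t] with follower[t + k].
    """
    results = {}
    for lag in range(max_lag + 1):
        results[lag] = pearson_r(leader[:len(leader) - lag], follower[lag:])
    lag = max(results, key=results.get)
    return lag, results[lag]

# Made-up weekly sentiment scores: the second series repeats the first
# one two weeks later.
state_a = [0, 1, 3, 2, 0, 1, 0, 2]
state_b = [1, 0, 0, 1, 3, 2, 0, 1]

lag, r = best_lag(state_a, state_b, max_lag=3)
print(lag, round(r, 3))
```

With only two months of data there are very few weeks to compare, so a lag found this way is a hint to investigate rather than a finding.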

2. State-specific topics show up in the text

Over the last couple of months, the Boston Police Department has been in the news frequently: the Commissioner was fired after being placed on leave, and the former Patrolmen’s Association leader was revealed to have been accused of child molestation.

We see that in the Playbooks. “Police” is mentioned far more in Massachusetts than in any other state, despite police reform being a national topic of conversation.

politico_police.png

While this is a rough proxy for the underlying issues (the word “police” doesn’t indicate whether it’s referring to a scandal, legislation, or a human interest piece), the fact that these are Politico newsletters suggests that something politically sensitive is happening and that the topic is likely the subject of political debate.

It also suggests that tracking Politico newsletters for particular words could give someone advance warning that an issue is starting to trend across the country even if they don’t have the time to read every newsletter themselves. Keywords like “infrastructure” or “climate” popping up in one part of the country may signal where legislation is likely to advance before it actually moves in the legislature.
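Tracking a keyword like this comes down to counting how often it appears in each state’s newsletters and adjusting for how much text each state produced. Here is a minimal sketch in Python. The function name mentions_per_1000_words and the two sample snippets are made up for illustration and are not actual Playbook text.

```python
import re
from collections import defaultdict

def mentions_per_1000_words(newsletters, keyword):
    """Count keyword mentions per 1,000 words for each state.

    newsletters is an iterable of (state, text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    target = keyword.lower()
    for state, text in newsletters:
        words = re.findall(r"[a-z']+", text.lower())
        totals[state] += len(words)
        hits[state] += words.count(target)
    return {s: 1000 * hits[s] / totals[s] for s in totals if totals[s]}

# Made-up snippets for illustration only.
sample = [
    ("MA", "Police reform bill stalls as police union objects"),
    ("NY", "Budget talks continue in Albany"),
]
rates = mentions_per_1000_words(sample, "police")
print(rates)
```

Reporting a rate rather than a raw count keeps a state with longer newsletters from looking like it talks about everything more.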

3. National politics are still local

While the 45th president of the United States has been a national topic of political conversation for six years now, he is now discussed more in his current geographic context.

“Trump” appeared about twice as often in Florida’s Playbook as in any other. Whether this is because he plays a large role in Republican politics and Florida is the only red state in this sample, or because he lives there, is difficult to tell.

However, a figure from the other side of the political spectrum suggests that it could be the local angle. Vice President Kamala Harris is a national figure, but she has been overwhelmingly mentioned in the California Playbook. It suggests that readers on the West Coast are keener to hear about what their former Senator is doing than readers in the rest of the country, and that national events are still filtered through a local prism.

harrispie.png

4. Economic news is more positive than COVID news

The spike in positivity in May got me wondering whether it was due to positive economic news or to increased vaccination rates. It appears that they could be related, since mentions of the economy and COVID spiked at the same time as sentiment.

politico_econ_states.png
politico_covid_states.png

With the caveat that correlation is not causation, I looked at the connections between the three measures. There was a small but statistically significant correlation between how positive a state’s newsletters were in a week and how frequently they mentioned the economy or COVID. The association with the economy was about twice as large as the one with COVID.

politico_sentiment_econ.png
politico_covid_sentiment.png
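For readers who want to try a comparison like this, one simple approach is to fit a straight line between weekly mention counts and weekly sentiment, then compare the slopes for different topics. The sketch below computes a least-squares slope in Python. It is an assumption that a linear fit resembles what was done here, since the post does not name the method, and the numbers are made up. It also does not test for statistical significance.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (simple linear regression)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    return sxy / sxx

# Made-up weekly values for one state, for illustration only.
weekly_mentions = [1, 2, 3, 4]
weekly_sentiment = [3, 5, 7, 9]

slope = ols_slope(weekly_mentions, weekly_sentiment)
print(slope)  # change in sentiment per additional mention
```

Slopes are only comparable across topics when the mention counts are measured on the same scale, for example mentions per 1,000 words.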

So what does this mean?

This is only scratching the surface of what we can do by text mining political content. With more months of Playbooks, or news from different states, or techniques like topic modelling, or correlation with other measures, we can get an understanding of politics that would otherwise require more reading than there is time in the day.

Try it for yourself with the Words in State Politico App: https://legislata.shinyapps.io/PoliticoWordApp/# It shows you the distribution of a word between the states since early April. We’ll update it every month, so be sure to check it out again later.

How does Legislata help?

Legislata is the workplace productivity tool for state legislators and their staff. Like readers of state-level newsletters, you have more information than you likely know what to do with. Our platform allows users to track constituent email, manage office tasks, and collaborate with peers in a single app.

With Legislata’s suite of tools, public servants can easily check their districts’ pulses, identify operational bottlenecks, and make data-driven policy decisions.

Legislata is in development. Sign up to be included in our closed beta test now, launching this summer.




