Who Wrote the Op-Ed: Text-Mining Edition

When the cowardly “Resistance” op-ed came out, my first thought was, Gee, I bet we could get some insights on authorship by doing an automated textual analysis. Because of course that was my first thought. Well, somebody was kind enough to do one for us. Specifically, Michael W. Kearney, a journalism and informatics professor at the University of Missouri. Here is the result; I’ll do a layperson’s explanation below, and then some technical links for those so inclined.


Executive summary: This analysis suggests that it was somebody from the office of the Vice President, the State Department, or the Department of Commerce.

What is this?

  • The y-axis is various Twitter accounts, labeled on the left.
  • The x-axis is the textual correlation.
  • Kearney took up to 3,200 tweets from each of the accounts listed, and ran an analysis on those corpora. He then compared the resulting numbers to the results of the same analysis run on the text of the op-ed.
  • The line at the top shows, of course, a 1.0 correlation with the op-ed itself. The next-highest are the Twitter accounts for the Vice President, Trump (who we can discount), Secretary Pompeo, Secretary Ross, and the State Department.
  • The analysis includes figures for things like comma usage, sentiment, politeness, word choice, first- and second-person preference, and so on.
  • It probably wasn’t somebody at the Department of Transportation.


  • Update: I assumed this went without saying, but obviously tweets are not an ideal data source; they were just the most readily usable data Kearney had lying around, and the analysis was done within a very short time period.
  • We know from reporting on the Wolff book that anonymous sources sometimes intentionally steal other staffers’ phrasing when providing quotes.
    • This could explain the use of ‘lodestar,’ a strongly Pence-affiliated word.
    • However, it is harder to fake things like comma usage.
  • Higher-ranking officials are likely, in their Twitter communications, to try to sound more like Trump, or in general to use more homogeneous language.
    • This could explain the ~0.7 cluster of the most important officials and departments.
  • These are not huge volumes of text, and thus the figures are potentially not representative.
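For the so inclined, here is a rough sketch of the general idea described in the bullets above: boil each text down to a handful of style numbers, then correlate the op-ed's numbers with each account's. To be clear, this is not Kearney's code or his feature set. The feature list, the function names (`style_features`, `pearson`, `rank_accounts`), and the choice of a plain Pearson correlation are all assumptions made up for illustration.

```python
import math
import re

# Hypothetical word lists, for illustration only.
FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}
SECOND_PERSON = {"you", "your", "yours"}


def style_features(text):
    """Turn a text into a small vector of per-word style rates."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    n = max(len(words), 1)  # avoid dividing by zero on empty text
    return [
        text.count(",") / n,                        # comma usage
        sum(len(w) for w in words) / n,             # average word length
        sum(w in FIRST_PERSON for w in words) / n,  # first-person preference
        sum(w in SECOND_PERSON for w in words) / n, # second-person preference
        text.count("!") / n,                        # exclamation rate
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def rank_accounts(op_ed, corpora):
    """Rank accounts by how closely their style vector tracks the op-ed's."""
    target = style_features(op_ed)
    scores = {name: pearson(target, style_features(text))
              for name, text in corpora.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Here `corpora` would map an account name to all of its tweets joined into one string. One caution about this sketch: features on very different scales (average word length versus comma rate) will dominate a raw correlation, so a real analysis would want many more features, each put on a comparable scale across accounts first.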

Technical Details


A policy analysis question

Later on today, I’m meeting up with two referees to go to Middle of Nowhere State College for a soccer game tonight. One of my colleagues is a public policy post-doc at the local policy school and the other is a business prof focusing on economic development issues. I work with this crew once or twice a year, and these are usually the geekiest and most enjoyable drives of the season.

The policy geek is always looking for good teaching examples, and I think we have a good one to sketch out tonight.

Over the past couple of years, the local high school soccer association and high school leagues have made a variety of changes to expectations and rules. Here are some of the changes in practice and expectations.

1) When there is a sub-varsity/varsity doubleheader, the start time moved from the traditional 6:00/7:30 or 6:30/8:00 split to a uniform 5:30/7:00pm. Referees are expected to be at the field at least 30 minutes before kick-off.
2) All referees must renew their clearances annually even though state law allows for three years of validity for educators.
3) 12 hours of in-season training (on nights when college or high amateur leagues play) instead of the previous 6 off-season hours and four in-season hours that were better scheduled to avoid conflicts. Most of the pre-season hours could be satisfied by either a college or USSF intermediate clinic.
4) New uniforms that are unique to the high school game and cannot be used for USSF or NCAA games (previously we used USSF shirts, shorts and three-striped socks).
5) Formal assessments were discontinued. Now the play-off pool is determined by a combination of seniority (length in the chapter) and three-season average varsity game count.

Pay has not changed since 2008.

The local high school referee group had 190 members in it for the 2015 playing season. That broke down to roughly 30 people whose highest level games were either professional or scholarship college, another 20 refs whose highest level games were D-3 college, ethnic men’s amateurs or Regional USSF youth ball, and 140 refs whose highest level games were either low level State Cup or high school, depending on how they were being assigned. I was uncomfortable working with about ten refs as they found ways to screw games up in old and uncreative ways. Last year, we sent 24 referees to the state playoff system. Twenty-one of them had as their highest level game either a professional match or a scholarship game.
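As a starting point for the analysis, the numbers above can be lined up to show how concentrated the playoff pool was under the old system. This is only back-of-the-envelope arithmetic on the figures in this post; the variable names are mine, and it assumes the 21 playoff referees all came from the group of roughly 30.

```python
# Membership tiers for the 2015 season, as given in the post (approximate).
members = {
    "pro_or_scholarship_college": 30,
    "d3_amateur_or_regional_youth": 20,
    "state_cup_or_high_school": 140,
}
total_members = sum(members.values())

# Last year's state playoff assignments, as given in the post.
playoff_refs = 24
playoff_refs_top_tier = 21

# Share of the membership in the top tier vs. their share of playoff slots.
top_tier_member_share = members["pro_or_scholarship_college"] / total_members
top_tier_playoff_share = playoff_refs_top_tier / playoff_refs

print(f"Top tier: {top_tier_member_share:.1%} of members, "
      f"{top_tier_playoff_share:.1%} of playoff referees")
```

The people with the most outside options are the ones the playoff system leaned on hardest, which is exactly the group the new training nights, uniforms, and seniority rules hit.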

Using basic policy analysis skills can we make any predictions about the impact of these policy changes?

The Politicization of Policy

Earlier today the Supreme Court deadlocked 4-4 on President Obama’s Executive Order concerning the status of parents of American citizens or legal residents who are themselves in the country illegally, and issued the following ruling: “The judgment is affirmed by an equally divided Court.” In the short term this means that the original District Court ruling, affirmed by the 5th Circuit Court of Appeals, stands. It is unclear whether the President will seek to enforce his executive order not to deport the parents of American citizens or legal residents outside of the 5th Circuit. The ruling is partially the result of Texas and 25 other states shopping for a sympathetic District Court Judge, which is why they filed in Brownsville, not Austin, the state capital. It demonstrates both the challenges of a divided Supreme Court and the politicization of policy.

While Speaker Ryan has issued a statement lauding the decision and claiming it as a victory for the Constitution and Congress, specifically under Article 1, this is simply part of the politicization of this particular policy. And that comes at a price, both in lives affected and in dollars spent. The reality that no one wants to mention when discussing the President’s DAPA and expanded DACA order to defer deportations for specific, low risk classes of undocumented people in the US is that Congress did write the Law. That is why Speaker Ryan’s claim of victory for Article 1, and for the power of Congress rather than the Executive Branch to write the Law, misses the point. Congress made it a misdemeanor to improperly enter the US, that is, to enter without papers while avoiding immigration control. Unlawful presence (overstaying one’s visa, or not leaving the US and returning to one’s home country when one is supposed to) is not actually a crime at all. The Executive Branch, however, has to administer (execute) this law. But here’s where the rubber of making Law hits the road of enforcing it: Congress also has to provide the ways and means.

Currently Congress only appropriates enough money for the Department of Homeland Security to deport approximately 450,000 undocumented immigrants a year who have illegally entered or overstayed their visas. This is not something new. Congress never appropriates enough money to deport everyone who has entered illegally or overstayed their visas. The cost of trying to identify, round up, and deport all of the estimated 11 million undocumented people in the US right now – both improper entry and unlawful presence – is estimated at no less than $100 billion and up to $600 billion. As a result every Presidential Administration has had to prioritize who to focus on. The focus is always on those who have been arrested and/or previously convicted of engaging in violent crimes or who are tied to human or drug trafficking or terrorist/extremist organizations. And this makes sense from a domestic, public policy standpoint: focus on those who present the greatest potential threat to the US, American citizens, legal residents, and those visiting the US. What Speaker Ryan, Governor Abbott of Texas and his 25 colleagues from when he was the Texas Attorney General, Federal District Court Judge Hanen, the 5th Circuit Court of Appeals, and the four Supreme Court Justices that voted to uphold the lower court rulings against the Administration’s Executive Orders have chosen to ignore is that tomorrow the Obama Administration still only has enough Congressionally appropriated funding to deport 450,000 people in the US illegally. And tomorrow the Department of Homeland Security is still going to have to prioritize who they focus on: the parents of an American citizen who are otherwise law abiding, apart from the Federal misdemeanor of improper entry or the not-actually-a-crime of unlawful presence, or the guy trafficking women for the sex trade.
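To put the gap between the Law and the means in concrete terms, here is the arithmetic using only the figures cited above. It is a rough sketch, not a forecast: it assumes the estimated population stays fixed at 11 million and that funded capacity stays at 450,000 removals per year, and the variable names are mine.

```python
# Figures as cited in this post.
undocumented_population = 11_000_000   # estimated people
funded_removals_per_year = 450_000     # approximate appropriated capacity
total_cost_low = 100e9                 # dollars, low estimate
total_cost_high = 600e9                # dollars, high estimate

# Share of the population that current funding can reach in a single year.
share_reachable_per_year = funded_removals_per_year / undocumented_population

# Years to work through everyone at the funded pace (static population assumed).
years_at_current_funding = undocumented_population / funded_removals_per_year

# Implied cost per person across the estimate range.
cost_per_person_low = total_cost_low / undocumented_population
cost_per_person_high = total_cost_high / undocumented_population

print(f"Reachable per year: {share_reachable_per_year:.1%}")
print(f"Years at current funding: {years_at_current_funding:.1f}")
print(f"Cost per person: ${cost_per_person_low:,.0f} to ${cost_per_person_high:,.0f}")
```

With funding for roughly one in twenty-four people in any given year, someone has to decide which one, and that decision is prioritization whether or not anyone announces it.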

We’ve reached this moment of policy and juridical stupidity because both the President and those opposing his policy of prioritization politicized the issue. The President publicly announced the policy of placing the parents of US citizens and legal residents on the low priority list for deportation, which provided them with an effective exemption. President Obama did this as part of a communication strategy to signal to an important constituency that he, and the Democratic Party, were not going to forget them even if Congress was unable or unwilling to act. The House GOP majority, as well as twenty-six Republican controlled states, responded by also strategically communicating to their constituencies that they would sue the President to overturn his Executive Order to ensure that the Law was administered and that only Congress, as Article 1 states, can write Law. The issue, which was already politicized, was dialed up to 11.

There is no way of knowing if, had the President not publicly announced what he was doing, the GOP House Majority or one or more of these 26 Republican governed states would have still objected as vehemently or opposed the President’s actions through a lawsuit. Moreover, there isn’t equal guilt for politicization on both sides. Until or unless Congress appropriates more funds for deportations, which they do not seem to be inclined to do, the Obama Administration, and any subsequent administrations, will only have the funding – the means – to identify, arrest, detain, and deport 450,000 undocumented people per year. No matter what Judge Hanen, the 5th Circuit Court of Appeals, or the Supreme Court rules, tomorrow the Department of Homeland Security, part of the Obama Administration’s Executive Branch, will still have to prioritize who to deport. I fully expect that they will continue to prioritize their efforts on those accused of and/or convicted of violent crimes, as well as those suspected to be trafficking drugs and people or of being affiliated with extremist or terrorist organizations. Focusing on less dangerous cohorts among the undocumented would create an actual threat to the safety and security of the US, its citizenry, its legal residents, and those visiting for work, school, or enjoyment.

Sort of Maybe a Bit Like Friday Recipe Exchange on Monday: Do NOT Try This at Home Edition!!!!!

Alton Brown has been tinkering again. He’s invented a way to make ice cream in under 10 seconds. The video is below. Whatever you do, do not try this at home!

Bon appetit! And open thread.

ERISA and All Claim databases

Nicholas Bagley is worried about a Supreme Court case.  In the case, Liberty Mutual, a self-insured company that has an administrative services only (ASO) contract with an insurer, is arguing that the requirement to submit claims data to the state of Vermont violates ERISA, which does not allow states to tell companies how to administer benefits to their employees.

It’s thus perverse that the Supreme Court is poised to rule in a case that could thwart efforts to get good data about health-care prices. In Gobeille v. Liberty Mutual, the Court will decide whether the Employee Retirement Income Security Act of 1974 (ERISA) supersedes laws, on the books in 18 states, requiring self-insured employers to report data about the prices they pay to “all payer claims databases.”

Because about two-thirds of all employees receive coverage through self-insured firms, exempting those firms from the reporting obligation would blow a giant hole in the state databases. If you’re persuaded that we’re paying too little attention to the problem of market power—and I am—then ruling against the states in Gobeille would be especially boneheaded.

There is a sub-optimal workaround if the Supremes rule for Liberty Mutual.

Most ASO contracts use standard networks and benefit configurations attached to a standard plan design.  ASOs will often seek cost savings by having their administrator craft narrow networks that carve out certain high cost providers.  But they are still attached to a standard plan design.

From a provider point of view, the provider can’t tell if a patient is Mayhew Insurance Fully Insured or Mayhew Insurance ASO employer self-insured.  They get paid the same rate for the same set of services whether the patient is in a Mayhew PPO or a Mayhew HMO.  The contract that a provider has with the insurer is far broader than the numerous options ASOs believe that they are getting.  One provider contract can and does cover a hundred network tweaks and eight hundred cost-sharing variants.

Most ASOs are fairly small (under 5,000 covered lives).  There is a limit to what an insurance company is willing to do to customize a plan.  Re-slicing a network is fairly easy.  That task could be anywhere from an afternoon if we were slicing out the most expensive 1% or 2% of the providers (a fairly common request) to a month if we were building a home host multi-tier with appropriate provider access designed to funnel money to a provider and take money away from a competing provider group.  The big challenges would be presenting a draft model to the clients built according to their written specifications and having them come back and tell us to add their CFO’s cardiologist and their CHRO’s endocrinologist back into the mix.  And those two docs belonged to the group that they were trying to screw.
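The easy end of that range, slicing out the most expensive 1% or 2% and then putting the executives' doctors back, looks roughly like this. It is a toy sketch and not any insurer's actual tooling; the provider fields (`id`, `cost_index`) and the `carve_out` function are hypothetical.

```python
def carve_out(providers, percent=2, always_keep=()):
    """Drop the most expensive `percent`% of providers from a network.

    `providers` is a list of dicts with hypothetical keys "id" and
    "cost_index" (higher means more expensive). Any provider id listed in
    `always_keep` stays in the network even if it falls in the dropped slice.
    """
    ranked = sorted(providers, key=lambda p: p["cost_index"], reverse=True)
    n_drop = len(ranked) * percent // 100  # integer math, rounds down
    dropped_ids = {p["id"] for p in ranked[:n_drop]} - set(always_keep)
    # Preserve the original ordering of the remaining providers.
    return [p for p in providers if p["id"] not in dropped_ids]
```

Calling `carve_out(network, percent=2, always_keep={"cfo_cardiologist"})` is the afternoon's work. Note that nothing in it touches the provider contract or the rates, which is the point of the next paragraph.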

What does not happen in most ASO custom plans is a rewrite of provider contracts.  Renewing and rewriting provider contracts and, more importantly, having providers sign onto a new contract is much more expensive than recutting an already contracted network.  For the Mayhew narrow Exchange product, 70% of the prep cost was getting new contracts out to providers to sign.  This changes when the ASO is large enough (Boeing and Starbucks in Seattle are doing some very interesting things on healthcare where contracts need to be rewritten), but most ASOs will have provider pricing similar to fully insured groups.

Since provider pricing is similar, probabilistic matching could be used to create demographically similar dummy members and their projected claims experience could be estimated within a useable but wide error band.
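A minimal sketch of what that matching could look like, under loud assumptions: the state knows (or can estimate) the age band and sex of each ASO-covered person, fully insured members in the database carry the same fields, and the record keys (`age_band`, `sex`, `annual_claims`) and function names are hypothetical. Each dummy member is drawn at random from fully insured members in the same demographic cell, and repeating the draw many times gives a crude error band.

```python
import random
import statistics
from collections import defaultdict


def build_cells(fully_insured):
    """Group fully insured members' annual claims by demographic cell."""
    cells = defaultdict(list)
    for member in fully_insured:
        cells[(member["age_band"], member["sex"])].append(member["annual_claims"])
    return cells


def estimate_aso_claims(aso_demographics, cells, draws=1000, seed=0):
    """Estimate total annual claims for an ASO group by resampling look-alikes.

    `aso_demographics` is a list of (age_band, sex) tuples, one per covered
    person. Returns (mean, 5th percentile, 95th percentile) of the simulated
    group totals.
    """
    missing = [key for key in aso_demographics if not cells.get(key)]
    if missing:
        raise ValueError(f"no fully insured members to match: {sorted(set(missing))}")

    rng = random.Random(seed)
    totals = []
    for _ in range(draws):
        # One dummy member per ASO person, drawn from the matching cell.
        totals.append(sum(rng.choice(cells[key]) for key in aso_demographics))
    totals.sort()
    low = totals[draws * 5 // 100]
    high = totals[draws * 95 // 100 - 1]
    return statistics.mean(totals), low, high
```

For a group of a few thousand lives the band will be wide, since a handful of very expensive members can swing the total, and the estimate inherits whatever selection differences exist between fully insured and self-insured populations.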

This is not ideal.  It is a third-best hack to work around a company not wanting to hand over a massive text file data dump that does not cost it much either to produce internally or to request from its administrator.  The administrator already produces that file anyway in order to bill the self-insured company.  The ideal case is the Supreme Court says this is a reporting requirement and not a benefit requirement so GTFO.