Posted on

Words of Persuasion – The Text Predictors of Persuasive TED Talks

What makes for a persuasive presentation? How can you speak to persuade? To find out, I scraped every TED Talk, analyzed the words in the transcripts and users’ ratings, and found that the most persuasive speakers used MORE negative emotion words and FEWER ‘I, me, my’ words.

So the next time you’re trying to convince someone, try calling attention to the negative emotions (‘sadness, frustration, anger’) the problem causes and to the ways ‘we,’ ‘us,’ or ‘you’ (not ‘I’) can solve them. You can also paste in the text of your own speech to see whether it’s likely to be persuasive.

This tool is the result of a natural language processing project to reveal linguistic and psychological features that predict a persuasive TED Talk. I scraped 2,600+ TED Talk transcripts and their metadata through 2017 and then used decision trees, random forest regressors, and linear regression to find key predictors of persuasive ratings by viewers.

I found that the change in negative and positive emotion words across the talk and the speaker’s use of key social pronouns like “I” and “we” made a big impact on persuasive ratings.
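To make those two kinds of features concrete, here is a minimal sketch of how they could be measured. The word lists are tiny, hypothetical stand-ins (a real analysis would use a full psycholinguistic dictionary), and the function names and the first-half versus second-half split are my assumptions for illustration, not the project's actual feature code.

```python
import re

# Hypothetical, tiny word lists for illustration only.
NEGATIVE_EMOTION = {"sad", "sadness", "angry", "anger", "frustration", "fear"}
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PERSON_PLURAL = {"we", "us", "our", "ours", "ourselves"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rate(tokens, category):
    """Share of tokens that belong to the given word category."""
    if not tokens:
        return 0.0
    return sum(token in category for token in tokens) / len(tokens)


def profile(text):
    """Pronoun rates for the whole text, plus the change in negative
    emotion words from the first half of the text to the second half."""
    tokens = tokenize(text)
    midpoint = len(tokens) // 2
    first_half, second_half = tokens[:midpoint], tokens[midpoint:]
    return {
        "i_rate": category_rate(tokens, FIRST_PERSON_SINGULAR),
        "we_rate": category_rate(tokens, FIRST_PERSON_PLURAL),
        "negative_change": (
            category_rate(second_half, NEGATIVE_EMOTION)
            - category_rate(first_half, NEGATIVE_EMOTION)
        ),
    }


sample = "This problem causes anger and frustration. We can fix it, and we will."
result = profile(sample)
print(result)
```

In this toy sample the negative emotion words sit in the first half, so `negative_change` comes out negative, and the speaker never says ‘I’, so `i_rate` is zero. Note that this simple tokenizer treats contractions such as “I’m” as a single token, so they are not counted as ‘I’.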

My analyses resulted in important categories of words that make up a “linguistic signature” of persuasion and a classifier that you can use to predict the persuasiveness of your own text.

For professionals who need to communicate and influence others, this is a data product that uses natural language processing techniques and statistical modeling to provide insights on how to speak to persuade.

Posted on

What’s your favorite Austin restaurant? (What was its last health inspection score?)

Last night, the organizers of a meetup issued a challenge: in 1 hour, pull together 1 visualization that tells a story from an open dataset. Our team created an interactive dashboard to explore 3 years of good and bad restaurant health inspection scores in Austin.

Search by restaurant name, choose your zip code area to see scores in your area, and see the trend in scores over time in the dashboard below. Click on a specific restaurant to see the scores over time.

I enjoyed working with Keisuke Irie, Scott Kurland, Andrew Riddle, and Vinay Bhat on this, talking good food spots and data.

Posted on

The greatest films of all time: a list compiled from multiple sources

This web app and open dataset I made for movie fans compiles several ‘greatest films’ lists to find the greatest of the great, and the analysis reveals seven films to be the best of the best.

Here's a Movie Recommender visualization and interface for the data hosted on Tableau Public:

The Problem

There’s nothing worse than sitting through a bad movie.

(OK, there are many things that are worse, but it’s still a bummer.)

So I always check reviews, recommendations, and ratings before committing to a film. If it doesn’t pass a certain threshold on a few of my trusted sources, I don’t watch it.

I want to be generally familiar with the history of film, and I want to see the movies that many other informed folks agree are worth seeing. I’m a sucker for a good story.

Some films influenced the culture in big ways and changed the art of filmmaking. I’d like to see as many of those movies as I can.

The Solution

To provide myself with reliable movie ideas, I have been collecting lists of “The Greatest Films of All Time.” I have lists created by film critics, film industry leaders, and screenwriters, and I decided to put them all together in one giant spreadsheet. The complete dataset I compiled is posted here as a CSV.

I will be using it to navigate movie choices, and since there are 1,212 titles on the master list, it’s going to be a multi-year journey. The source lists are “greatest films” publications from the American Film Institute, the Writers Guild of America, The Sight & Sound Top 50, The Guardian, and 1001 Movies to See Before You Die.

I will be moving through the master list watching the films that many film experts, critics, and screenwriters agree are the best movies ever made.

What Movies Are On All ‘Greatest Films’ Lists?

Using Python in a Jupyter Notebook, I imported the CSV as a pandas DataFrame and performed a basic query to find which movies appear on every list.
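For anyone who wants to try a query like this, here is a minimal sketch. It assumes a layout with one row per film and one 0/1 column per source list. The column names and film titles below are hypothetical placeholders, not the structure or contents of the real CSV.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV: one row per film,
# one 0/1 column per source list (1 = the film is on that list).
csv_text = """title,list_a,list_b,list_c
Film A,1,1,1
Film B,1,1,0
Film C,1,0,0
Film D,1,1,1
"""

films = pd.read_csv(io.StringIO(csv_text))

# Every column except the title is treated as a list-membership flag.
list_columns = [column for column in films.columns if column != "title"]

# Keep only the rows where every membership flag equals 1.
on_every_list = films[films[list_columns].eq(1).all(axis=1)]
every_list_titles = on_every_list["title"].tolist()
print(every_list_titles)
```

With the placeholder data, only Film A and Film D carry a 1 in every list column, so they are the two titles returned.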

There are 7 films that appear on every list and they are: Citizen Kane, The Searchers, Some Like It Hot, Psycho, The Godfather: Part I, The Godfather: Part II, and Apocalypse Now.

The 7 films that appeared on every list

I’ve seen all those films, so I used another query to find which movies appear on 6 of the 7 lists. (I dropped the Sight & Sound Top 50 list.)

The movies on 6 out of 7 lists were: Gone With the Wind, The Wizard of Oz, Casablanca, Double Indemnity, North by Northwest, The Apartment, Dr. Strangelove, The Graduate, Butch Cassidy and the Sundance Kid, The Wild Bunch, Chinatown, Annie Hall, Star Wars, Raiders of the Lost Ark, E.T. the Extra-Terrestrial, Goodfellas, and Pulp Fiction.
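One way to read that second query is “films on every list except the one that was dropped.” Here is a rough sketch of that reading, again with hypothetical column names and placeholder titles rather than the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in data: one row per film, one 0/1 column per list.
csv_text = """title,list_a,list_b,list_c
Film A,1,1,1
Film B,1,1,0
Film C,1,0,0
Film D,0,1,1
"""

films = pd.read_csv(io.StringIO(csv_text))

# The list being set aside (an assumed column name for this sketch).
dropped_list = "list_c"
kept_lists = [
    column for column in films.columns if column not in ("title", dropped_list)
]

# On every kept list, but missing from the dropped one, so films that
# were already on every list are not counted a second time.
mask = films[kept_lists].eq(1).all(axis=1) & films[dropped_list].eq(0)
near_unanimous_titles = films.loc[mask, "title"].tolist()
print(near_unanimous_titles)
```

Here only Film B qualifies: it is on both kept lists and absent from the dropped one.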

North by Northwest (1959) directed by Alfred Hitchcock

I’m going to start this hero’s journey through cinema history with North by Northwest. It’s the oldest movie that I haven’t seen yet that appears on 6 out of 7 of the greatest films of all time lists.

What sources do you use for picking the next movie you see? Rotten Tomatoes Percent Fresh score, IMDB star rating, or a list you keep around? How many of the top 7 have you seen?

Posted on

Comedy Central Presents – Complete Episode List with IMDB Ratings

If you are serious about laughing, here’s a dataset for you and a visualization (below and here) you can use to explore it.

Good stand-up comedy looks easy, but it is a structured, precise, and subtle form of communication, a tightrope walk in front of a live audience, that is always a millisecond or a mumble away from failure.

To see masters and up-and-comers at work in front of a great audience is a privilege, and that makes the Comedy Central Presents series a treasure trove of great performances by the best and brightest in stand-up comedy.

Comedy happens on the edges, and there are various genres within stand-up comedy that you might or might not dig. If your tastes align with the users of IMDB, you might want to use their ratings to help you dive into this series.

The dataset includes each episode’s IMDB ID, its IMDB episode info URL, and a URL to view the episode on Amazon Instant Video.


Which performers had the highest rated appearances?

  • Stephen Lynch, Mitch Hedberg, Pablo Francisco, Brian Regan, and Gabriel Iglesias had the top 5 best ratings

Highest Rated Performances

Which performers appeared the most times?

  • Lewis Black appeared on the program 3 times, and 21 other comedians appeared 2 times on the series

Rating of Multiple Appearances
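Counting repeat appearances is a simple tally. Here is a small sketch using Python’s `collections.Counter`. The performer names are hypothetical placeholders, not the real episode credits.

```python
from collections import Counter

# Hypothetical episode credits, one entry per performance.
performers = ["Comic A", "Comic B", "Comic A", "Comic C", "Comic A", "Comic B"]

# Tally how many times each performer appears.
appearance_counts = Counter(performers)

# The single most frequent performer, as a (name, count) pair.
most_frequent = appearance_counts.most_common(1)[0]

# Everyone who appeared more than once.
repeat_performers = {
    name: count for name, count in appearance_counts.items() if count > 1
}

print(most_frequent)
print(repeat_performers)
```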

Which season had the highest ratings on IMDB?

  • Seasons 3, 5, and 1 were the best rated at 7.41, 7.40, and 7.29 respectively 

Best Season
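A season-level comparison like this can be sketched with a pandas group-by. I am assuming here that a season’s score is the plain average of its episode ratings. The numbers below are made up for illustration and are not the real IMDB ratings.

```python
import pandas as pd

# Hypothetical episode ratings; each row is one episode.
episodes = pd.DataFrame(
    {
        "season": [1, 1, 2, 2],
        "rating": [7.0, 8.0, 6.0, 7.0],
    }
)

# Average the episode ratings within each season, best season first.
season_means = (
    episodes.groupby("season")["rating"]
    .mean()
    .sort_values(ascending=False)
    .round(2)
)

best_season = season_means.index[0]
print(season_means)
```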

Which performances are most reviewed on IMDB?

  • Performances by Mitch Hedberg, Dane Cook, and Pablo Francisco got more than 400 reviews to date

Performers with Highest Number of Reviews

Were there differences in the linguistic content of the various performances?

  • I am collecting subtitles for each performance in preparation for this analysis

Posted on

It’s All Connected

The ToK System as described by Henriques (2003).

It’s all connected, isn’t it? It’s not just six degrees of separation from any actor to Kevin Bacon; it’s a low number of degrees of separation between anything and anything else in the world.

Start with any observation or fact and you can get to any other observation or fact through nodes and trees and branches and leaves.

Above and below are some cool visualizations related to some fascinating explanations of interconnectedness.

Looking at these images will probably make you feel more connected to your fellow humans (animals and plants too) than looking at your iPhone. (Unless you are looking at these images on your iPhone.)


The Tree of Life as described by


If you want more than pictures, the full text and rationale behind these images are at The Tree of Knowledge System, The Tree of Life, and The Computer Tree.

In the weeks and months ahead, more geeky explorations of science, art, and culture will happen on this site.

Until then, stay in touch. But then, you can’t help but be, right?

An Updated Computer Tree (to 1970) from the endpapers of the Computer Yearbook for 1969 (Computer yearbook and directory Detroit [Mich.]: International Electronics Information Services, 1969–1971)