Lexical Analysis of 2012 Presidential Debates: Obama vs Romney
Martin Krzywinski

Lexical Analysis of Joe Biden vs Paul Ryan

Introduction

While the presidential candidates get three tries, there is only one vice-presidential debate. Let's see how Biden and Ryan fared.

Word Statistics

Debate Word Count

Summary Word Count

The summary word count reports the total number of words and the number of unique words used by each candidate. Word number is expressed as both absolute and relative values.
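As a rough illustration of how the two counts relate, the Python sketch below tokenizes a short made-up passage and reports the total and unique word counts. The tokenizer (lowercase runs of letters and apostrophes) and the function name `summary_word_count` are assumptions for illustration only, not the method used to build these tables.

```python
import re

def summary_word_count(text):
    """Return (total words, unique words, unique fraction) for a text.

    Assumption: a word is a lowercase run of letters and apostrophes.
    The tokenizer actually used for the tables is not described here.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    unique = len(set(words))
    fraction = unique / total if total else 0.0
    return total, unique, fraction

# Made-up sample, not taken from the transcript.
sample = "We need more jobs. We need more time."
total, unique, fraction = summary_word_count(sample)
print(total, unique, f"{100 * fraction:.1f}%")
```

Applied to a full transcript, the same ratio gives the relative unique word value, for example 1,235 / 6,631 = 18.6% for Biden.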

Table 1a
all words
Number of all words and unique words used by each speaker.
speaker      words (a)   % of debate (b)   unique words (c)   (c) as % of (a) (d)
Joe Biden    6,631       46.2%             1,235              18.6%
Paul Ryan    7,708       53.8%             1,419              18.4%
total        14,339      100.0%            2,013              14.0%


Table 1b
exclusive and shared words
Words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set               words (a)   % (b)    unique words (c)   (c) as % of (a) (d)
Joe Biden         899         13.6%    594                66.1%
Paul Ryan         1,340       17.4%    778                58.1%
both candidates   12,100      84.4%    641                5.3%


Table 1
commentary
Biden slowest speaker.

Biden spoke for longer (41:32 = 2,492 seconds) than Ryan (40:12 = 2,412 seconds) (debate timing by CNN).

Ryan used +16.2% (7,708 vs 6,631) more words than Biden — he spoke faster with a rate of 3.20 words/second, +20.3% (3.2 vs 2.66) faster than Biden's 2.66 words/second. Biden and Ryan delivered nearly the same relative number of unique words, with Biden edging Ryan slightly by Δrel=+1.1% (Δabs=+0.2%, 18.6% vs 18.4%).
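The arithmetic behind these figures can be reproduced in a few lines of Python. This is only a sketch; `rate` and `rel_diff` are hypothetical helper names. Note that the +20.3% quoted above is computed from the rounded rates (3.2 and 2.66); the unrounded rates give roughly +20.1%.

```python
def rate(words, seconds):
    """Speaking rate in words per second."""
    return words / seconds

def rel_diff(a, b):
    """Relative difference of a with respect to b, in percent."""
    return 100.0 * (a - b) / b

# Word counts from Table 1a, speaking times from the CNN timing quoted above.
biden_rate = rate(6631, 2492)
ryan_rate = rate(7708, 2412)

print(f"Biden {biden_rate:.2f} words/s, Ryan {ryan_rate:.2f} words/s")
print(f"Ryan vs Biden, words: {rel_diff(7708, 6631):+.1f}%")
# Rounded rates, as quoted in the text.
print(f"Ryan vs Biden, rate:  {rel_diff(round(ryan_rate, 2), round(biden_rate, 2)):+.1f}%")
# Relative unique word fractions (18.6% vs 18.4%).
print(f"Biden vs Ryan, unique fraction: {rel_diff(18.6, 18.4):+.1f}%")
```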

Biden delivered -8.9% (6,631 vs 7,280) fewer words than Obama and spoke -6.0% (2.66 vs 2.83) words/second more slowly. Ryan's speed was also slower than that of his running mate, Romney, by -5.0% (3.20 vs 3.37), and he delivered slightly fewer words, -1.1% (7,708 vs 7,791).

The debate was more dynamic than the first presidential debate, focusing on a greater variety of issues. Correspondingly, the vice-presidential candidates' unique word fractions were higher than those of their presidential counterparts in the first debate. Recall that Obama had a unique word fraction of 17.2% while Romney had 15.6%. This debate saw Biden with 18.6% and Ryan with 18.4% unique words.

Table 1
legend
a c
b d

a :: word count

b :: word count, as fraction in total in debate

c :: unique words in (a)

d :: unique words in (a), as fraction in (a)

bar :: proportion of (a-c):c

Stop Word Contribution

In the table below, the candidates' delivery is partitioned into stop and non-stop words. Stop words (full list) are frequently-used bridging words (e.g. pronouns and conjunctions) whose meaning depends entirely on context. The fraction of words that are stop words is one measure of the complexity of speech.
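A minimal Python sketch of this partition follows. The stop list here is a tiny illustrative set, not the full list referred to above, and `partition` is a hypothetical helper name.

```python
# Tiny illustrative stop list; the full list used for the tables is not reproduced here.
STOP_WORDS = {"a", "and", "the", "of", "to", "we", "it", "is", "in", "that"}

def partition(words, stop_words=STOP_WORDS):
    """Split a word list into (stop, non_stop) lists, preserving order."""
    stop = [w for w in words if w in stop_words]
    non_stop = [w for w in words if w not in stop_words]
    return stop, non_stop

words = "we want to see the transition be successful".split()
stop, non_stop = partition(words)
stop_fraction = len(stop) / len(words)
print(stop, non_stop, stop_fraction)
```

With a realistic stop list the stop fraction is much higher than in this toy example; in this debate it is 57.2% for Biden and 54.8% for Ryan.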

Table 2a
non-stop words
Counts of stop and non-stop words.
speaker     category   words (a)   % (b)    unique words (c)   (c) as % of (a) (d)
Joe Biden   all        6,631       100.0%   1,235              18.6%
Joe Biden   stop       3,792       57.2%    142                3.7%
Joe Biden   non-stop   2,839       42.8%    1,093              38.5%
Paul Ryan   all        7,708       100.0%   1,419              18.4%
Paul Ryan   stop       4,227       54.8%    142                3.4%
Paul Ryan   non-stop   3,481       45.2%    1,277              36.7%
total       all        14,339      100.0%   2,013              14.0%
total       stop       8,019       55.9%    158                2.0%
total       non-stop   6,320       44.1%    1,855              29.4%


Table 2b
exclusive and shared non-stop words
Non-stop words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set               words (a)   % (b)    unique words (c)   (c) as % of (a) (d)
Joe Biden         874         30.8%    578                66.1%
Paul Ryan         1,297       37.3%    762                58.8%
both candidates   4,149       65.6%    515                12.4%


Table 2
commentary
Ryan saying more, using speech more distinctive than Biden.

Ryan's fraction of non-stop words is the larger of the two at 45.2%, +5.6% (45.2 vs 42.8) higher than Biden's.

Biden had significantly fewer exclusive non-stop words than Ryan, -24.1% (578 vs 762). Compare this to Obama, who had +6.8% (597 vs 559) more exclusive words than Romney in the first debate. The spread of exclusive words between Biden/Ryan and Obama/Romney is quite different. Whereas Obama and Romney distinguished themselves with roughly the same number of words (597 vs 559), Ryan had a great deal more than Biden (762 vs 578).

Table 2
legend
a c
b d

a :: total number of words, for a given category (all, stop, non-stop)

b :: (a) relative to words in the debate if category=all, otherwise relative to words by the candidate

c :: number of unique words with set (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

All further word use statistics represent content that has been filtered for stop words.

Word frequency

The word frequency table summarizes the frequency with which words were used. I show the average word frequency and the weighted cumulative frequencies at the 50th and 90th percentiles. The average word frequency indicates how many times, on average, a word is used. For a given fraction of the entire delivery, the weighted cumulative frequency indicates the largest word frequency within this fraction (details about weighted cumulative distribution).
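One plausible reading of the weighted cumulative frequency is sketched in Python below: sort words by frequency, accumulate their occurrences, and report the frequency at which the running total first covers the requested fraction of all word occurrences. The function name and the toy data are assumptions for illustration; the linked details page remains the authority on the exact definition.

```python
from collections import Counter

def weighted_cumulative_frequency(counts, fraction):
    """Smallest frequency f such that words used at most f times
    account for at least `fraction` of all word occurrences."""
    total = sum(counts.values())
    running = 0
    for freq in sorted(counts.values()):
        running += freq  # each word is weighted by how often it was used
        if running >= fraction * total:
            return freq
    return 0  # only reached for an empty input

# Toy data: "a" used 4 times, "b" 3 times, "c" twice, "d" once.
counts = Counter("a a a a b b b c c d".split())
average = sum(counts.values()) / len(counts)
f50 = weighted_cumulative_frequency(counts, 0.5)
f90 = weighted_cumulative_frequency(counts, 0.9)
print(average, f50, f90)
```

Under this reading, a 50% value of 20 means that half of everything a speaker said was made up of words used 20 times or fewer.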

Table 3a
word use frequency
Average and 50%/90% percentile word frequencies.
speaker     category   average (a)   50% (b)   90% (c)
Joe Biden   all        5.4           20        158
Joe Biden   stop       26.7          71        388
Joe Biden   non-stop   2.6           4         18
Paul Ryan   all        5.4           23        201
Paul Ryan   stop       29.8          85        256
Paul Ryan   non-stop   2.7           4         22
total       all        7.1           42        337
total       stop       50.8          143       438
total       non-stop   3.4           7         35


Table 3b
exclusive and shared non-stop word use frequency
Average and 50%/90% cumulative percentile word frequencies. Non-stop words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set         average (a)   50% (b)   90% (c)
Joe Biden   1.51          2         5
Paul Ryan   1.70          2         8
total       3.41          7         35


Table 3
commentary
Biden low on repetition. Ryan hammers on exclusive words.

Neither Biden, who was least likely to reuse words with a non-stop average frequency of 2.6, nor Ryan (2.7) approaches Romney's average word frequency of 3.2.

Ryan liked to repeat words that were exclusive to him, with an average frequency +12.6% (1.7 vs 1.51) higher than Biden.

Table 3
legend
a b c

a :: average word frequency

b :: largest word frequency in 50% of content

c :: largest word frequency in 90% of content

bar :: proportion of a:b:c

Sentence Size

Table 4
sentence size
Number of sentences spoken by each speaker and sentence word count statistics. Number of words in a sentence is shown by average and 50%/90% cumulative values for all, stop and non-stop words.
speaker     sentences   category   average (a)   50% (b)   90% (c)
Joe Biden   532         all        12.5          19        47
Joe Biden   532         stop       7.5           11        27
Joe Biden   532         non-stop   5.5           8         21
Paul Ryan   636         all        12.1          17        33
Paul Ryan   636         stop       6.9           9         18
Paul Ryan   636         non-stop   5.6           8         15
total       1,168       all        14.3          18        39
total       1,168       stop       9.1           11        23
total       1,168       non-stop   7.5           9         19


Table 4
commentary
Both candidates' sentences shorter than those of presidential candidates. Ryan's speech terse.

Both candidates' sentences were short (Biden 5.5, Ryan 5.6 non-stop words/sentence). Compare these numbers to those of Obama, whose sentences had more structure with an average length of 8.4 non-stop words.

Biden's longest sentence was 91 words. Not quite Obama's 112-word whopper, but close.

Biden, 91 words — We need more, but 5.2 million - if they'd get out of the way, if they'd get out of the way and let us pass the tax cut for the middle class, make it permanent, if they get out of the way and pass the - pass the jobs bill, if they get out of the way and let us allow 14 million people who are struggling to stay in their homes because their mortgages are upside down, but they never missed a mortgage payment, just get out of the way.

Ok, that was not a fair sentence. He was obviously blabbering. Let's look at his next largest, at 90 words.

Biden, 90 words — And instead of signing pledges to Grover Norquist not to ask the wealthiest among us to contribute to bring back the middle class, they should be signing a pledge saying to the middle class we're going to level the playing field; we're going to give you a fair shot again; we are going to not repeat the mistakes we made in the past by having a different set of rules for Wall Street and Main Street, making sure that we continue to hemorrhage these tax cuts for the super wealthy.

Ryan on the other hand had a relatively terse longest sentence.

Ryan, 44 words — But we want to see the 2014 transition be successful, and that means we want to make sure our commanders have what they need to make sure that it is successful so that this does not once again become a launching pad for terrorists.

Table 4
legend
a b c

a :: average sentence size

b :: largest sentence size for 50% of content

c :: largest sentence size for 90% of content

bar :: proportion of a:b:c

Part of Speech Analysis

In this section, word frequency is broken down by part of speech (POS). The four POS groups examined are nouns, verbs, adjectives and adverbs. Conjunctions and prepositions are not considered. The first category (n+v+adj+adv) is composed of all four POS groups.

Part of Speech Count

Table 5
part of speech count
Count of words categorized by part of speech (POS).
speaker     part of speech   words (a)   % (b)    unique words (c)   (c) as % of (a) (d)
Joe Biden   n+v+adj+adv      2,586       39.0%    1,031              39.9%
Joe Biden   nouns            1,307       50.5%    534                40.9%
Joe Biden   verbs            761         29.4%    353                46.4%
Joe Biden   adjectives       406         15.7%    248                61.1%
Joe Biden   adverbs          112         4.3%     55                 49.1%
Paul Ryan   n+v+adj+adv      3,142       40.8%    1,212              38.6%
Paul Ryan   nouns            1,553       49.4%    631                40.6%
Paul Ryan   verbs            946         30.1%    403                42.6%
Paul Ryan   adjectives       522         16.6%    280                53.6%
Paul Ryan   adverbs          121         3.9%     53                 43.8%
total       n+v+adj+adv      5,728       39.9%    1,762              30.8%
total       nouns            2,860       49.9%    926                32.4%
total       verbs            1,707       29.8%    618                36.2%
total       adjectives       928         16.2%    442                47.6%
total       adverbs          233         4.1%     80                 34.3%


Table 5
commentary
Large spread in relative unique adjectives - Biden more diverse.

Ryan pounced on Biden with a greater total number of unique nouns, verbs and adjectives. In relative terms, Biden had a higher fraction of unique words across all parts of speech. He particularly distinguished himself with more diverse use of adjectives.

If you compare the relative unique word percent spreads to the first presidential debate, the difference is stark. Whereas Obama and Romney had their relative unique word fractions within 2-3% for each part of speech, the difference between Biden and Ryan grows to as much as 7.5% for adjectives.

Table 5
legend
a c
b d

a :: total number of words for a given POS (all, noun, verb, adjective, adverb, pronoun)

b :: (a) relative to all words by candidate

c :: unique words in (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

Part of Speech Frequency

Table 5
part of speech frequency
Frequency of words categorized by part of speech (POS).
speaker     part of speech   average (a)   50% (b)   90% (c)
Joe Biden   n+v+adj+adv      2.51          4         18
Joe Biden   nouns            2.45          4         16
Joe Biden   verbs            2.16          3         13
Joe Biden   adjectives       1.64          2         6
Joe Biden   adverbs          2.04          3         13
Paul Ryan   n+v+adj+adv      2.59          4         23
Paul Ryan   nouns            2.46          4         17
Paul Ryan   verbs            2.35          3         23
Paul Ryan   adjectives       1.86          2         7
Paul Ryan   adverbs          2.28          3         14
total       n+v+adj+adv      3.25          6         36
total       nouns            3.09          5         30
total       verbs            2.76          5         35
total       adjectives       2.10          3         12
total       adverbs          2.91          5         27


Table 5
commentary
Adverbs repeated less often than verbs.

Noun repetition was similar for both candidates. Ryan repeated verbs, adjectives and adverbs at levels consistently higher than Biden. Both candidates repeated adverbs less frequently than their verbs, which does not hold true for Obama or Romney.

Table 5
legend
a b c

a :: average word frequency

b :: largest word frequency in 50% of content

c :: largest word frequency in 90% of content

bar :: proportion of a:b:c

Part of Speech Pairing

Through word pairing, I extract concepts from the text. The number of unique word pairs is a function of sentence length and is one of the measures of complexity.
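The pairing step can be sketched in Python as follows. The pairing rule used here (every unordered pair of tagged words within the same sentence) is an assumption made for illustration, as are the function name `pos_pairs` and the hand-tagged input; the exact rule used for the tables is not spelled out in this section.

```python
from collections import Counter
from itertools import combinations

def pos_pairs(tagged_sentences):
    """Count word pairs per part-of-speech pairing.

    tagged_sentences: list of sentences, each a list of (word, pos) tuples.
    Assumption: every unordered pair of tagged words in a sentence counts.
    Returns {pairing: Counter mapping word pair to number of occurrences}.
    """
    pairs = {}
    for sentence in tagged_sentences:
        for (w1, p1), (w2, p2) in combinations(sentence, 2):
            # Sort so that noun/verb and verb/noun land in the same bucket.
            (pa, wa), (pb, wb) = sorted([(p1, w1), (p2, w2)])
            pairs.setdefault(f"{pa}/{pb}", Counter())[(wa, wb)] += 1
    return pairs

# Hand-tagged toy input; a real run would need a part-of-speech tagger.
sentences = [
    [("infringing", "verb"), ("catholic", "adjective"), ("charities", "noun")],
    [("infringing", "verb"), ("catholic", "adjective"), ("hospitals", "noun")],
]
pairs = pos_pairs(sentences)
for pairing, counter in sorted(pairs.items()):
    total, unique = sum(counter.values()), len(counter)
    print(pairing, total, unique)
```

Because a sentence of n tagged words yields n(n-1)/2 pairs under this rule, longer sentences contribute disproportionately more pairs, which is why pair counts track sentence length.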

Table 6a
part of speech pairing — Joe Biden
Word pairs (total and unique) categorized by part of speech (POS)
pairing               pairs (a)   unique pairs (c)   (c) as % of (a) (d)
noun/noun             3,612       2,810              77.8%
verb/noun             3,850       3,158              82.0%
verb/verb             917         782                85.3%
adjective/noun        1,542       1,300              84.3%
adjective/verb        826         733                88.7%
adjective/adjective   160         142                88.8%
adverb/noun           581         501                86.2%
adverb/verb           325         285                87.7%
adverb/adjective      131         126                96.2%
adverb/adverb         20          17                 85.0%


Table 6b
part of speech pairing — Paul Ryan
Word pairs (total and unique) categorized by part of speech (POS)
pairing               pairs (a)   unique pairs (c)   (c) as % of (a) (d)
noun/noun             2,956       2,475              83.7%
verb/noun             3,337       2,852              85.5%
verb/verb             877         724                82.6%
adjective/noun        1,706       1,467              86.0%
adjective/verb        977         830                85.0%
adjective/adjective   241         220                91.3%
adverb/noun           399         369                92.5%
adverb/verb           239         202                84.5%
adverb/adjective      118         107                90.7%
adverb/adverb         18          17                 94.4%


Table 6c
unique part of speech pairing — candidate comparison
Unique word pairs categorized by part of speech (POS)
pairing               Joe Biden (a)   Paul Ryan (c)   (c) as % of (a) (d)
noun/noun             2,810           2,475           88.1%
verb/noun             3,158           2,852           90.3%
verb/verb             782             724             92.6%
adjective/noun        1,300           1,467           112.8%
adjective/verb        733             830             113.2%
adjective/adjective   142             220             154.9%
adverb/noun           501             369             73.7%
adverb/verb           285             202             70.9%
adverb/adjective      126             107             84.9%
adverb/adverb         17              17              100.0%


Table 6
commentary
Ryan more likely to combine adjectives. Biden forms more complex sentences with adverbs.

Ryan had a consistently higher number of adjective/noun, adjective/verb and adjective/adjective pairings than Biden, suggesting that he inserted adjectives into more sentences than Biden. Biden had more adjective/adverb combinations, which indicate sentences in which both concept and action are modified.

Table 6 a,b
legend
a c
  d

a :: total number of pairs, for a given category (e.g. verb/noun)

c :: number of unique pairs within set (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

Table 6c
legend
a c
  d

a :: unique pairs for Joe Biden

c :: unique pairs for Paul Ryan

d :: (c) relative to (a) (i.e. Paul Ryan relative to Joe Biden)

bars :: (a) and (c)

Exclusive and Shared Usage

This section enumerates words that were exclusive to a candidate (e.g. used by one candidate but not the other). This content provides insight into what the candidates' priorities are and reveals differences in perspective on similar topics.

For a given part of speech, the table breaks down the number of words that were spoken by only one of the candidates or both candidates (intersection). The last row includes words spoken by either candidate (union).
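In set terms, the rows of the table are the two set differences, the intersection and the union of the candidates' vocabularies. A minimal Python sketch, with made-up counts and a hypothetical function name `exclusive_and_shared`:

```python
from collections import Counter

def exclusive_and_shared(counts_a, counts_b):
    """Split two word-count tables into exclusive and shared sets.

    Returns a dict mapping set name to (total occurrences, unique words).
    """
    a, b = set(counts_a), set(counts_b)

    def tally(words):
        # Occurrences are summed over both speakers, so the shared row
        # includes every use of a shared word by either candidate.
        total = sum(counts_a.get(w, 0) + counts_b.get(w, 0) for w in words)
        return total, len(words)

    return {
        "a only": tally(a - b),   # words used by A but not B
        "b only": tally(b - a),   # words used by B but not A
        "both": tally(a & b),     # intersection
        "total": tally(a | b),    # union
    }

# Made-up counts for illustration only.
counts_a = Counter({"fact": 3, "friend": 2, "tax": 4})
counts_b = Counter({"tax": 5, "nuclear": 2})
result = exclusive_and_shared(counts_a, counts_b)
print(result)
```

The three partial rows always add up to the union row, which is a useful sanity check on the table: 824 + 1,216 + 3,688 = 5,728 words and 550 + 731 + 481 = 1,762 unique words.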

Table 7
exclusive word usage
Total and unique words used exclusively by a candidate, or by both.
set               part of speech   words (a)   (b)      (c)      unique words (d)   (e)      (f)
Joe Biden         n+v+adj+adv      824         100.0%   14.4%    550                66.7%    31.2%
Joe Biden         nouns            396         48.1%    13.8%    251                63.4%    27.1%
Joe Biden         verbs            244         29.6%    14.3%    186                76.2%    30.1%
Joe Biden         adjectives       159         19.3%    17.1%    129                81.1%    29.2%
Joe Biden         adverbs          25          3.0%     10.7%    22                 88.0%    27.5%
Paul Ryan         n+v+adj+adv      1,216       100.0%   21.2%    731                60.1%    41.5%
Paul Ryan         nouns            604         49.7%    21.1%    367                60.8%    39.6%
Paul Ryan         verbs            348         28.6%    20.4%    237                68.1%    38.3%
Paul Ryan         adjectives       231         19.0%    24.9%    153                66.2%    34.6%
Paul Ryan         adverbs          33          2.7%     14.2%    23                 69.7%    28.8%
both candidates   n+v+adj+adv      3,688       100.0%   64.4%    481                13.0%    27.3%
both candidates   nouns            1,748       47.4%    61.1%    239                13.7%    25.8%
both candidates   verbs            1,037       28.1%    60.7%    138                13.3%    22.3%
both candidates   adjectives       443         12.0%    47.7%    86                 19.4%    19.5%
both candidates   adverbs          166         4.5%     71.2%    28                 16.9%    35.0%
total             n+v+adj+adv      5,728       100.0%   100.0%   1,762              30.8%    100.0%
total             nouns            2,860       49.9%    100.0%   926                32.4%    100.0%
total             verbs            1,707       29.8%    100.0%   618                36.2%    100.0%
total             adjectives       928         16.2%    100.0%   442                47.6%    100.0%
total             adverbs          233         4.1%     100.0%   80                 34.3%    100.0%


Table 7
commentary
Candidates share fewer nouns and adjectives than presidential counterparts.

With the exception of adverbs, Ryan had significantly larger total exclusive words for each part of speech. Both candidates shared 25.8% of nouns, lower than the 29.3% that we saw with Romney and Obama in their first debate. Shared verb and adverb fractions were similar. Much more exclusivity in adjectives was seen: only 19.5% of adjectives were shared, compared to 27.4% in the first presidential debate.

Table 7c
legend
a d
b e
c f

a :: total number of words in set (e.g. obama \ romney, obama ∩ romney, obama ∪ romney), for a given part of speech

b :: (a) relative to all exclusive words in n+v+adj+adv

c :: (a) relative to all words in n+v+adj+adv

d :: unique words in (a)

e :: (d) relative to (a)

f :: (d) relative to all unique words in n+v+adj+adv

bar1 :: normalized ratio of (a-d):d

bar2 :: absolute ratio of (a-d):d for all POS groups (first column) or POS group (other columns)

Noun Phrase Usage

Noun phrases were extracted from the text and analyzed for frequency, word count, unique word count and richness. Single-word phrases were not counted.

Top-level noun phrases are those without a parent noun phrase (a parent phrase is one that is a similar, longer phrase). Derived noun phrases are those with a parent (more details about noun phrase analysis).

The top-level noun phrases can be interpreted as independent concepts. Derived noun phrases can be interpreted as variants on concepts embodied by the top-level phrases.
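To make the top-level/derived distinction concrete, here is a small Python sketch. It assumes that a "similar, longer phrase" means a strictly longer phrase that contains the shorter one as a contiguous run of words; that reading, the function names and the example phrases are all assumptions for illustration, and the linked noun phrase analysis page should be consulted for the actual rule.

```python
def contains(longer, shorter):
    """True if the words of `shorter` appear contiguously inside `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))

def split_phrases(phrases):
    """Split noun phrases into (top_level, derived) lists.

    Assumption: a phrase is derived when some strictly longer phrase
    contains it; the text only says a parent is "similar" and longer.
    """
    tokens = [tuple(p.split()) for p in phrases]
    top, derived = [], []
    for phrase, words in zip(phrases, tokens):
        has_parent = any(
            len(t) > len(words) and contains(t, words) for t in tokens
        )
        (derived if has_parent else top).append(phrase)
    return top, derived

# Made-up phrases for illustration only.
phrases = ["tax cut", "middle class tax cut", "jobs bill", "middle class"]
top, derived = split_phrases(phrases)
print(top, derived)
```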

Noun Phrase Count and length

This table reports the absolute number of noun phrases, which is related to the number of nouns, and their length.

Table 8a
noun phrase count
Counts of noun phrases in words and per noun.
speaker     set         phrases (a)   (b)      per noun (c)   unique phrases (d)   (e)     per unique noun (f)
Joe Biden   all         436           100.0%   0.33           240                  55.0%   0.45
Joe Biden   top-level   381           87.4%    0.29           233                  61.2%   0.44
Paul Ryan   all         591           100.0%   0.38           298                  50.4%   0.47
Paul Ryan   top-level   497           84.1%    0.32           294                  59.2%   0.47


Table 8b
noun phrase length
Average and 50%/90% cumulative length of noun phrases, in words.
speaker     set         average (a)   50% (b)   90% (c)
Joe Biden   all         2.28          2         3
Joe Biden   top-level   2.31          2         3
Paul Ryan   all         2.23          2         3
Paul Ryan   top-level   2.27          2         3


Table 8
commentary
Ryan has much greater share of exclusive noun phrases.

Biden delivered almost the same number of top-level noun phrases as Obama (233 vs 234). Ryan's speech was terse with 0.47 unique noun phrases per unique noun (compare Romney at 0.42 and Obama at 0.43).

Table 8a
legend
a d
b e
c f

a :: number of noun phrases

b :: (a) relative to number of all noun phrases

c :: number of noun phrases per noun

d :: number of unique phrases

e :: (d) relative to (a)

f :: number of unique noun phrases per unique noun

bar :: normalized ratio of (a-d):d

Table 8b
legend
a b c

a :: average noun phrase size, in words

b :: largest noun phrase size in 50% of content

c :: largest noun phrase size in 90% of content

bar :: proportion of a:b:c


Exclusive and Shared Noun Phrase Count and length

Table 9a
exclusive and shared noun phrase count
Counts of exclusive and shared noun phrases in words and per noun.
set               phrases     phrases (a)   (b)     unique phrases (c)   (c) as % of (a) (d)
Joe Biden         all         367           34.0%   226                  61.6%
Joe Biden         top-level   354           96.5%   222                  62.7%
Paul Ryan         all         509           47.2%   285                  56.0%
Paul Ryan         top-level   475           93.3%   284                  59.8%
both candidates   all         172           16.0%   31                   18.0%
both candidates   top-level   74            43.0%   18                   24.3%


Table 9b
exclusive and shared noun phrase length
Average and 50%/90% cumulative length of noun phrases, in words.
set               phrases     average (a)   50% (b)   90% (c)
Joe Biden         all         2.32          2         4
Joe Biden         top-level   2.33          2         4
Paul Ryan         all         2.26          2         3
Paul Ryan         top-level   2.28          2         3
both candidates   all         2.08          2         3
both candidates   top-level   2.18          2         3


Table 9
commentary

Whereas Obama and Romney had a similar number of exclusive top-level noun phrases in their first debate (226 vs 224), and Biden's level was similar (222), Ryan had +27.9% (284 vs 222) more than Biden.

Table 9a
legend
a c
b d

a :: number of noun phrases

b :: (a) relative to number of all noun phrases

c :: number of unique phrases

d :: (c) relative to (a)

bar :: normalized ratio of (a-c):c

Table 9b
legend
a b c

a :: average noun phrase size, in words

b :: largest noun phrase size in 50% of content

c :: largest noun phrase size in 90% of content

bar :: proportion of a:b:c


Windbag Index

The Windbag Index is a compound measure that characterizes the complexity of speech. A low index is indicative of succinct speech with a low degree of repetition and a large number of independent concepts.

Table 10
windbag index
Windbag Index for each speaker. The higher the value, the more repetitive the speech.
speaker     index value   unrounded   relative to other speaker
Joe Biden   199           199.693     -33.1%
Paul Ryan   298           298.360     +49.4%

index terms (relative difference to the other speaker in parentheses)
term   Joe Biden         Paul Ryan
t1     0.428 (-5.2%)     0.452 (+5.5%)
t2     0.385 (+4.9%)     0.367 (-4.7%)
t3     0.409 (+0.6%)     0.406 (-0.6%)
t4     0.464 (+8.9%)     0.426 (-8.2%)
t5     0.611 (+13.9%)    0.536 (-12.2%)
t6     0.491 (+12.1%)    0.438 (-10.8%)
t7     0.550 (+9.2%)     0.504 (-8.4%)
t8     0.971 (-1.6%)     0.987 (+1.6%)
Table 10
commentary
Biden consistently less repetitive than Ryan.

Biden's Windbag Index is extremely low, -33.1% lower than Ryan's (199 vs 298). Obama and Romney had values of 468 and 685 in their first debate. Only two of Biden's terms were lower than Ryan's, and therefore pushed his index up: the fraction of non-stop words (incidentally, this was also the fraction for which Obama's value was lower) and the fraction of noun phrases that were top-level.

Table 10
legend
The Windbag Index is 1/(t1*t2*...*t8) where t1,t2,...,t8 are

t1 :: fraction of words which are non-stop

t2 :: fraction of non-stop words which are unique

t3 :: fraction of nouns which are unique

t4 :: fraction of verbs which are unique

t5 :: fraction of adjectives which are unique

t6 :: fraction of adverbs which are unique

t7 :: fraction of noun phrases which are unique

t8 :: fraction of noun phrases which are top-level


Large individual terms t1...t8 contribute to a smaller index.

The percentage values below the index and each term are relative differences to the other speaker's corresponding term (i.e. 100*(a-b)/b where a is the value for one speaker and b for the other).
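The index can be reproduced from the counts in Tables 2a, 5 and 8a. The Python sketch below is an illustration, not the original code: the dictionary keys are hypothetical names, and the choice of counts for each term (in particular unique top-level phrases over unique phrases for t8) is an inference, picked because it reproduces the terms listed in Table 10.

```python
from math import prod

def windbag_index(c):
    """Return (index, terms): the reciprocal of the product of eight terms."""
    terms = [
        c["non_stop"] / c["words"],                      # t1
        c["unique_non_stop"] / c["non_stop"],            # t2
        c["unique_nouns"] / c["nouns"],                  # t3
        c["unique_verbs"] / c["verbs"],                  # t4
        c["unique_adjectives"] / c["adjectives"],        # t5
        c["unique_adverbs"] / c["adverbs"],              # t6
        c["unique_phrases"] / c["phrases"],              # t7
        c["unique_top_phrases"] / c["unique_phrases"],   # t8 (inferred)
    ]
    return 1 / prod(terms), terms

def rel_diff(a, b):
    """Relative difference of a with respect to b, in percent."""
    return 100 * (a - b) / b

# Counts copied from Tables 2a, 5 and 8a.
biden = dict(words=6631, non_stop=2839, unique_non_stop=1093,
             nouns=1307, unique_nouns=534, verbs=761, unique_verbs=353,
             adjectives=406, unique_adjectives=248,
             adverbs=112, unique_adverbs=55,
             phrases=436, unique_phrases=240, unique_top_phrases=233)
ryan = dict(words=7708, non_stop=3481, unique_non_stop=1277,
            nouns=1553, unique_nouns=631, verbs=946, unique_verbs=403,
            adjectives=522, unique_adjectives=280,
            adverbs=121, unique_adverbs=53,
            phrases=591, unique_phrases=298, unique_top_phrases=294)

biden_index, biden_terms = windbag_index(biden)
ryan_index, ryan_terms = windbag_index(ryan)
print(f"Biden {biden_index:.2f}, Ryan {ryan_index:.2f}")
print(f"Biden relative to Ryan: {rel_diff(biden_index, ryan_index):+.1f}%")
```

Because the index is a reciprocal of a product, a small drop in any single term raises the index by the same relative amount, which is why consistently higher unique fractions compound into a much lower value for Biden.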

Word Clouds

In the word clouds below, the size of the word is proportional to the number of times it was used by a candidate (method details).

Not all words from a group used to draw the cloud fit in the image — less frequently used words for large word groups may fall outside the image.

All Words for Each Candidate

Each candidate's debate portion was extracted and frequencies were compiled for each part of speech (noun, verb, adjective, adverb), with words colored by their part of speech category.

The distribution of sizes within a tag cloud follows the frequency distribution of words. However, word size cannot be compared between clouds, since the minimum and maximum size of the words is fixed.
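The following Python sketch shows why sizes are not comparable across clouds when the size range is fixed. It assumes a simple linear mapping from count to font size with hypothetical limits of 10 and 60; the actual scaling used to draw these clouds is not described here and may differ.

```python
def font_sizes(counts, min_size=10, max_size=60):
    """Map word counts linearly onto a fixed font-size range.

    Assumption: linear scaling with hypothetical size limits.
    """
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        return {w: max_size for w in counts}
    scale = (max_size - min_size) / (hi - lo)
    return {w: min_size + (c - lo) * scale for w, c in counts.items()}

# Made-up counts: the top word in cloud A is used five times as often
# as the top word in cloud B, yet both are drawn at the maximum size.
cloud_a = font_sizes({"people": 40, "tax": 20, "plan": 10})
cloud_b = font_sizes({"american": 8, "fact": 4, "friend": 2})
print(cloud_a)
print(cloud_b)
```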

Debate Word Cloud for Joe Biden - all words

Debate tag cloud for Joe Biden

Debate Word Cloud for Paul Ryan - all words

Debate tag cloud for Paul Ryan
commentary

Biden's focus is "american" and "middle-class". For Ryan, it's the generic "people".

Exclusive Words for Each Candidate

The clouds below show words used exclusively by a candidate. For example, if candidate A used the word "invest" (any number of times), but candidate B did not, then the word will appear in the exclusive word tag cloud for candidate A.

Words exclusive to Joe Biden

Debate tag cloud for Joe Biden

Words exclusive to Paul Ryan

Debate tag cloud for Paul Ryan
commentary

Biden has "fact", "friend" and modifies with "particularly" and "mostly". Ryan fulfils the Republican fearmongering agenda with "nuclear", "lose", "failed" and "attack". Recall that Romney had similarly devastating words ("lose", "hurt" and "crushed").

Part of Speech Word Clouds

In these clouds, words from each major part of speech were colored based on whether they were exclusive to a candidate or shared by the candidates.

The size of the word is relative to the frequency for the candidate — word sizes between candidates should not be used to indicate difference in absolute frequency.

Cloud of noun words, by speaker

commentary

Biden thinks reality is a buddy, with "fact" and "friend" being prominent. Ryan's language is combative with "weapons", "marine" and "terrorists". Ryan does say "youtube", which means he has the Google.

Cloud of verb words, by speaker

commentary

Biden uses "love", which is wonderful, and Ryan has "failed", "try", "lose" and "run".

Cloud of adjective words, by speaker

commentary

When Ryan mentions "lower", it's not about expectations. Tax, surely.

Cloud of adverb words, by speaker

commentary

Ryan has "pretty" and "safely", slightly unusual words given the threatening nature of his frequently used verbs and nouns. His "faster" is matched by Romney's "always".

Cloud of all words, by speaker

commentary

Relatively few exclusive words in the center are surrounded by a cloud of shared (grey) words. Quite a different view from the first presidential debate, for which this cloud was full of large words with fewer grey contributions.

Word Pair Clouds for Each Candidate

word pairs for Joe Biden

Clouds by Joe Biden: adjective/adjective, adjective/adverb, adjective/noun, adjective/verb, adverb/adverb, adverb/noun, adverb/verb, noun/noun, noun/verb, verb/verb

word pairs for Paul Ryan

Clouds by Paul Ryan: adjective/adjective, adjective/adverb, adjective/noun, adjective/verb, adverb/adverb, adverb/noun, adverb/verb, noun/noun, noun/verb, verb/verb
commentary

Biden mixes "believe" a lot, which shows up in the verb pairs as "believe free", "aid believe" and "making believe". The last pair sounds vaguely more like Romney than Biden.

Ryan fulfils the mandate to spread fear with "terrorist attack". "Infringing catholic" comes from his sentence:

They're infringing upon our first freedom, the freedom of religion, by infringing on Catholic charities, Catholic churches, Catholic hospitals.

Downloads

Debate transcript

Parsed word lists (word lists, part of speech lists, noun phrases, sentences)

Word clouds

Raw data structure

Please see the methods section for details about these files.