Word Analysis of 2016 Presidential Debates: Clinton vs Trump, by Martin Krzywinski

Word Analysis of Hillary Clinton vs Donald Trump (1st debate)

Introduction

Word Statistics

Debate Word Count

Summary Word Count

The summary word count reports the total number of words and the number of unique words used by each candidate. Word counts are given as both absolute and relative values.
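The four fields reported for each speaker can be sketched as follows. This is a minimal illustration, not the code behind this page: the tokenizer, the function names and the toy input are all assumptions.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def summary_counts(words_by_speaker):
    """Return total words, share of debate, unique words and unique share per speaker."""
    debate_total = sum(len(words) for words in words_by_speaker.values())
    summary = {}
    for speaker, words in words_by_speaker.items():
        total = len(words)
        unique = len(set(words))
        summary[speaker] = {
            "words": total,                           # field a
            "share_of_debate": total / debate_total,  # field b
            "unique": unique,                         # field c
            "unique_share": unique / total,           # field d
        }
    return summary

# Toy input (hypothetical), not the debate transcript.
toy = {
    "A": tokenize("We will invest in jobs and we will invest in people"),
    "B": tokenize("Jobs jobs jobs"),
}
summary = summary_counts(toy)
```

With the toy input, speaker A has 11 words of which 7 are unique, and speaker B has 3 words of which 1 is unique.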

Table 1a
all words
Number of all words and unique words used by each speaker.
set               words (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton       6,226       43.9%        1,328              21.3%
Donald Trump          7,941       56.1%        1,176              14.8%
total                14,167      100.0%        1,915              13.5%

Table 1b
exclusive and shared words
Words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set               words (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton       1,048       16.8%          739              70.5%
Donald Trump          1,130       14.2%          587              51.9%
both candidates      11,989       84.6%          589               4.9%

Table 1
legend
a c
b d
bar

a :: word count

b :: word count, as a fraction of the total in the debate

c :: unique words in (a)

d :: unique words in (a), as a fraction of (a)

bar :: proportion of (a-c):c

Table 1
commentary

Trump dominated the debate in terms of time and total word count. 56% of the debate's words were delivered by Trump.

His unique word count, however, was lower than Clinton's both in absolute and relative terms. He delivered 1,176 unique words (14.8% of his total word count) to Clinton's 1,328 (21.3% of her total word count).

Clinton used more distinct words that Trump did not. She delivered a total of 1,048 words that her opponent did not say, 70.5% of these being unique. Trump said 1,130 words that Clinton did not use and only 51.9% of these were unique. Using the number of these unique words as a measure, Clinton distinguished herself better from her opponent.

Stop Word Contribution

In the table below, the candidates' delivery is partitioned into stop and non-stop words. Stop words (full list) are frequently-used bridging words (e.g. pronouns and conjunctions) whose meaning depends entirely on context. The fraction of words that are stop words is one measure of the complexity of speech.

Table 2a
non-stop words
Counts of stop and non-stop words.
speaker           category   words (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton   all            6,226      100.0%        1,328              21.3%
Hillary Clinton   stop           3,519       56.5%          137               3.9%
Hillary Clinton   non-stop       2,707       43.5%        1,191              44.0%
Donald Trump      all            7,941      100.0%        1,176              14.8%
Donald Trump      stop           4,673       58.8%          142               3.0%
Donald Trump      non-stop       3,268       41.2%        1,034              31.6%
total             all           14,167      100.0%        1,915              13.5%
total             stop           8,192       57.8%          148               1.8%
total             non-stop       5,975       42.2%        1,767              29.6%

Table 2b
exclusive and shared non-stop words
Non-stop words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set               words (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton       1,038       38.3%          733              70.6%
Donald Trump          1,102       33.7%          576              52.3%
both candidates       3,835       64.2%          458              11.9%

Table 2
legend
a c
b d
bar

a :: total number of words, for a given category (all, stop, non-stop)

b :: (a) relative to words in the debate if category=all, otherwise relative to words by the candidate

c :: number of unique words with set (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

Table 2
commentary

Stop words are words like "and", "of" and "in" (full list). The fraction of stop words for both Clinton and Trump was similar—about 56-59%.

Clinton delivered more non-stop content. 43.5% of her words were non-stop, compared to Trump's 41.2%. Furthermore, unique words made up 44% of these, as opposed to Trump's 31.6%.
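The partition into stop and non-stop words, and the four fields reported per category, can be sketched as below. This is only an illustration under stated assumptions: the stop list here is a deliberately short, hypothetical one (the analysis uses its own full list), and the function names are mine.

```python
# Hypothetical, deliberately short stop list; the analysis uses its own full list.
STOP_WORDS = {"and", "of", "in", "the", "we", "to"}

def partition(words, stop_words=STOP_WORDS):
    """Split a word list into (stop, non_stop) lists, preserving repeats."""
    stop = [w for w in words if w in stop_words]
    non_stop = [w for w in words if w not in stop_words]
    return stop, non_stop

def category_row(category_words, speaker_total):
    """Return the four table fields for one category of one speaker."""
    total = len(category_words)
    unique = len(set(category_words))
    return {
        "a": total,                                # words in the category
        "b": total / speaker_total,                # relative to the speaker's words
        "c": unique,                               # unique words in the category
        "d": unique / total if total else 0.0,     # unique words relative to (a)
    }

# Toy sentence (hypothetical), not taken from the transcript.
words = "we need to invest in the future of the country and in jobs".split()
stop, non_stop = partition(words)
stop_row = category_row(stop, len(words))
non_stop_row = category_row(non_stop, len(words))
```

The same arithmetic reproduces the table values: for Clinton, 2,707 non-stop words out of 6,226 is 43.5%, and 1,191 unique words out of 2,707 is 44.0%.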

Word frequency

The word frequency table summarizes how often words were used. I show the average word frequency and the weighted cumulative frequencies at the 50th and 90th percentiles. The average word frequency indicates how many times, on average, a word is used. For a given fraction of the entire delivery, the weighted cumulative frequency indicates the largest word frequency within this fraction (details about weighted cumulative distribution).
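One way to read these definitions is sketched below. The average is the total word count divided by the number of unique words, which agrees with the tables (for Clinton's non-stop words, 2,707 / 1,191 is about 2.27). The weighted cumulative value is my interpretation of the description above, not the page's own code: distinct words are sorted from least to most frequent, each word contributes its frequency to a running total, and the reported value is the frequency reached when the running total first covers the requested fraction of the delivery.

```python
from collections import Counter

def average_frequency(words):
    """Average number of times a distinct word is used."""
    counts = Counter(words)
    return len(words) / len(counts)

def weighted_cumulative_frequency(words, fraction):
    """Largest word frequency within the given fraction of the delivery.

    Assumed reading: walk distinct words from rarest to most frequent,
    accumulating the number of spoken words they account for.
    """
    if not words:
        raise ValueError("words must not be empty")
    counts = Counter(words)
    total = len(words)
    covered = 0
    for freq in sorted(counts.values()):
        covered += freq
        if covered >= fraction * total:
            return freq
    return max(counts.values())

# Toy delivery (hypothetical): 10 words, 4 distinct.
toy_words = ["tax"] * 5 + ["jobs"] * 3 + ["trade", "china"]
avg = average_frequency(toy_words)
f50 = weighted_cumulative_frequency(toy_words, 0.5)
f90 = weighted_cumulative_frequency(toy_words, 0.9)
```

In the toy delivery the sorted frequencies are 1, 1, 3, 5, so half of the content is covered once the word used 3 times is included, and 90% requires the word used 5 times.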

Table 3a
word use frequency
Average and 50%/90% percentile word frequencies.
speaker           category   average (a)   50% (b)   90% (c)
Hillary Clinton   all                4.7        19       205
Hillary Clinton   stop              25.7        65       236
Hillary Clinton   non-stop           2.3         4        17
Donald Trump      all                6.8        28       220
Donald Trump      stop              32.9        74       261
Donald Trump      non-stop           3.2         5        25
total             all                7.4        42       466
total             stop              55.4       135       475
total             non-stop           3.4         7        41

Table 3b
exclusive and shared non-stop word use frequency
Average and 50%/90% cumulative percentile word frequencies. Non-stop words exclusive to speaker (e.g. speaker A but not speaker B) and shared by speakers (speaker A and B).
set               average (a)   50% (b)   90% (c)
Hillary Clinton          1.42         1         4
Donald Trump             1.91         2         8
total                    3.38         7        41

Table 3
legend
a b c
bar

a :: average word frequency

b :: largest word frequency in 50% of content

c :: largest word frequency in 90% of content

bar :: proportion of a:b:c

Table 3
commentary

This table hints at how repetitive a debater's delivery is. If we exclude stop words, Clinton repeated a word 2.3 times on average. Trump, on the other hand, repeated a word 3.2 times on average.

When we look at the content exclusive to each candidate, Clinton repeated words that Trump did not use only 1.4 times on average. Trump, on the other hand, repeated the words that Clinton did not use about 2 times.

Sentence Size

Table 4
sentence size
Number of sentences spoken by each speaker and sentence word count statistics. Number of words in a sentence is shown by average and 50%/90% cumulative values for all, stop and non-stop words.
speaker           sentences   category   average (a)   50% (b)   90% (c)
Hillary Clinton         429   all               14.6        19        36
Hillary Clinton         429   stop               8.3        11        21
Hillary Clinton         429   non-stop           6.5         8        17
Donald Trump            708   all               11.2        15        38
Donald Trump            708   stop               6.8         9        24
Donald Trump            708   non-stop           4.7         7        17
total                 1,137   all               14.5        18        37
total                 1,137   stop               9.4        11        24
total                 1,137   non-stop           7.4         9        17

Table 4
legend
a b c
bar

a :: average sentence size

b :: largest sentence size for 50% of content

c :: largest sentence size for 90% of content

bar :: proportion of a:b:c

Table 4
commentary

Clinton's average sentence length was about 15 words. About 6.5 of those words were non-stop words. She delivered 429 sentences.

Trump delivered +65.0% (708 vs 429) more sentences and their non-stop content was 27.7% (4.7 vs 6.5) shorter than Clinton's. This was apparent during the debate: Trump would start a sentence without finishing the previous one.

Here the analysis heavily depends on the punctuation in the transcript.
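To illustrate why punctuation matters, here is a minimal sketch that splits text into sentences at terminal punctuation and counts the words in each. The splitting rule is an assumption made for illustration and is not necessarily the one used for this page.

```python
import re

def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def sentence_sizes(text):
    """Number of word tokens in each sentence."""
    return [len(re.findall(r"[A-Za-z']+", s)) for s in split_sentences(text)]

# The same nine words, transcribed with two different punctuation choices.
with_periods = "We have to do better. We will. Believe me!"
with_commas = "We have to do better, we will, believe me!"

sizes_periods = sentence_sizes(with_periods)
sizes_commas = sentence_sizes(with_commas)
```

The first transcription yields three short sentences and the second yields a single nine-word sentence, so sentence counts and sizes shift with the transcriber's choices.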

All further word use statistics represent content that has been filtered for stop words, unless explicitly indicated.

Part of Speech Analysis

In this section, words are broken down by their part of speech (POS). The four POS groups examined are nouns, verbs, adjectives and adverbs. Conjunctions and prepositions are not considered. The first category (n+v+adj+adv) is composed of all four POS groups.

Part of Speech Count

Table 5
part of speech count
Count of words categorized by part of speech (POS).
speaker           part of speech     words (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton   n+v+adj+adv            2,514       40.4%        1,135              45.1%
Hillary Clinton   nouns (n)              1,116       44.4%          558              50.0%
Hillary Clinton   verbs (v)                781       31.1%          389              49.8%
Hillary Clinton   adjectives (adj)         461       18.3%          259              56.2%
Hillary Clinton   adverbs (adv)            156        6.2%           62              39.7%
Donald Trump      n+v+adj+adv            2,992       37.7%          986              33.0%
Donald Trump      nouns (n)              1,333       44.6%          506              38.0%
Donald Trump      verbs (v)                860       28.7%          318              37.0%
Donald Trump      adjectives (adj)         598       20.0%          275              46.0%
Donald Trump      adverbs (adv)            201        6.7%           51              25.4%
total             n+v+adj+adv            5,506       38.9%        1,694              30.8%
total             nouns (n)              2,449       44.5%          885              36.1%
total             verbs (v)              1,641       29.8%          572              34.9%
total             adjectives (adj)       1,059       19.2%          456              43.1%
total             adverbs (adv)            357        6.5%           86              24.1%

Table 5
legend
a c
b d
bar

a :: total number of words for a given POS (all, noun, verb, adjective, adverb, pronoun)

b :: (a) relative to all words by candidate

c :: unique words in (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

Table 5
commentary

Trump delivered 13.1% (986 vs 1,135) fewer total unique nouns, verbs, adjectives and adverbs. Both candidates had a similar relative ratio of each part of speech, about 44:30:20:6.

However, Clinton delivered +10.3% (558 vs 506) more unique nouns and +22.3% (389 vs 318) more unique verbs. This suggests that her content had more ideas and more actions.

Trump slightly edged Clinton by +6.2% (275 vs 259) on the use of adjectives.

Part of Speech Frequency

Table 5
part of speech frequency
Frequency of words categorized by part of speech (POS).
speaker           part of speech     average (a)   50% (b)   90% (c)
Hillary Clinton   n+v+adj+adv               2.21         3        17
Hillary Clinton   nouns (n)                 2.00         3        15
Hillary Clinton   verbs (v)                 2.01         3        21
Hillary Clinton   adjectives (adj)          1.78         2         8
Hillary Clinton   adverbs (adv)             2.52         4        19
Donald Trump      n+v+adj+adv               3.03         5        26
Donald Trump      nouns (n)                 2.63         4        16
Donald Trump      verbs (v)                 2.70         4        27
Donald Trump      adjectives (adj)          2.17         3        13
Donald Trump      adverbs (adv)             3.94         8        39
total             n+v+adj+adv               3.25         6        37
total             nouns (n)                 2.77         5        26
total             verbs (v)                 2.87         5        50
total             adjectives (adj)          2.32         4        17
total             adverbs (adv)             4.15        12        55

Table 5
legend
a b c
bar

a :: average word frequency

b :: largest word frequency in 50% of content

c :: largest word frequency in 90% of content

bar :: proportion of a:b:c

Table 5
commentary

We've already seen that Trump repeated his words more. This table looks at how this repetition breaks down by part of speech.

For example, Clinton repeated her nouns 24.0% (2.00 vs 2.63) less and her verbs 25.6% (2.01 vs 2.70) less than Trump.

Trump had a very high repetition rate of adverbs, +56.3% (3.94 vs 2.52) higher than Clinton.

Part of Speech Pairing

Through word pairing, I extract concepts from the text. The number of unique word pairs is a function of sentence length and is one of the measures of complexity.
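As a rough illustration of pairing, the sketch below forms every pair of tagged words that occur in the same sentence and tallies total and unique pairs per POS category. The pairing rule (all co-occurring pairs within a sentence) and the pre-tagged toy input are assumptions; the page does not spell out its rule here, and no tagger is run. Under this rule a sentence with n tagged words yields n(n-1)/2 pairs, which is one way the pair count could depend on sentence length.

```python
from collections import defaultdict
from itertools import combinations

def pos_pair_counts(tagged_sentences):
    """Return {(pos1, pos2): (total_pairs, unique_pairs)} over all sentences."""
    total = defaultdict(int)
    unique = defaultdict(set)
    for sentence in tagged_sentences:
        for (w1, p1), (w2, p2) in combinations(sentence, 2):
            # Order each pair by (pos, word) so that noun/verb and
            # verb/noun fall into the same category.
            (p1, w1), (p2, w2) = sorted([(p1, w1), (p2, w2)])
            key = (p1, p2)
            total[key] += 1
            unique[key].add((w1, w2))
    return {key: (total[key], len(unique[key])) for key in total}

# Toy, hand-tagged sentences (hypothetical).
tagged = [
    [("jobs", "noun"), ("rising", "verb"), ("new", "adjective")],
    [("jobs", "noun"), ("rising", "verb")],
]
pairs = pos_pair_counts(tagged)
```

Here the noun/verb pair "jobs rising" occurs twice but is counted as one unique pair, mirroring the total and unique columns of the pairing tables.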

Table 6a
part of speech pairing - Hillary Clinton
Word pairs (total and unique) categorized by part of speech (POS)
pairing               total (a)   unique (c)   unique share (d)
noun/noun                 2,171        1,948              89.7%
verb/noun                 2,953        2,685              90.9%
verb/verb                   933          857              91.9%
adjective/noun            1,620        1,477              91.2%
adjective/verb            1,077          984              91.4%
adjective/adjective         308          284              92.2%
adverb/noun                 632          593              93.8%
adverb/verb                 430          402              93.5%
adverb/adjective            242          227              93.8%
adverb/adverb                40           38              95.0%

Table 6b
part of speech pairing - Donald Trump
Word pairs (total and unique) categorized by part of speech (POS)
pairing               total (a)   unique (c)   unique share (d)
noun/noun                 2,514        2,071              82.4%
verb/noun                 2,694        2,218              82.3%
verb/verb                   715          554              77.5%
adjective/noun            1,801        1,519              84.3%
adjective/verb              963          820              85.2%
adjective/adjective         309          265              85.8%
adverb/noun                 602          505              83.9%
adverb/verb                 382          307              80.4%
adverb/adjective            231          196              84.8%
adverb/adverb                35           25              71.4%

Table 6c
unique part of speech pairing - candidate comparison
Unique word pairs categorized by part of speech (POS)
pairing               Hillary Clinton (a)   Donald Trump (c)   Trump relative to Clinton (d)
noun/noun                           1,948              2,071                          106.3%
verb/noun                           2,685              2,218                           82.6%
verb/verb                             857                554                           64.6%
adjective/noun                      1,477              1,519                          102.8%
adjective/verb                        984                820                           83.3%
adjective/adjective                   284                265                           93.3%
adverb/noun                           593                505                           85.2%
adverb/verb                           402                307                           76.4%
adverb/adjective                      227                196                           86.3%
adverb/adverb                          38                 25                           65.8%

Table 6 a,b
legend
a c
  d
bar

a :: total number of pairs, for a given category (e.g. verb/noun)

c :: number of unique pairs within set (a)

d :: (c) relative to (a)

bar :: proportion of (a-c):c

Table 6c
legend
a c
  d
bars

a :: unique pairs for Hillary Clinton

c :: unique pairs for Donald Trump

d :: (c) relative to (a) (i.e. Donald Trump relative to Hillary Clinton)

bars :: (a) and (c)

Table 6
commentary

Clinton had +21.1% (2,685 vs 2,218) more unique noun-verb pairings than Trump and +54.7% (857 vs 554) more unique verb-verb pairings. This suggests that she spoke more in terms of action concepts and compound actions.

Exclusive and Shared Usage

This section enumerates words that were exclusive to a candidate (i.e. used by one candidate but not the other). This content provides insight into what the candidates' priorities are and reveals differences in perspective on similar topics.

For a given part of speech, the table breaks down the number of words that were spoken by only one of the candidates or both candidates (intersection). The last row includes words spoken by either candidate (union).
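The exclusive, shared and combined sets are ordinary set difference, intersection and union on each candidate's vocabulary. A minimal sketch follows, with hypothetical function and key names. Totals count every occurrence of a word in the set, by either speaker, which matches the earlier word counts: in Table 1b, 1,048 + 1,130 + 11,989 = 14,167 and 739 + 587 + 589 = 1,915.

```python
from collections import Counter

def exclusive_and_shared(words_a, words_b):
    """Total and unique word counts for A only, B only, both and either."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    vocab_a, vocab_b = set(counts_a), set(counts_b)

    def row(vocab):
        # Counter returns 0 for words a speaker never used.
        total = sum(counts_a[w] + counts_b[w] for w in vocab)
        return {"total": total, "unique": len(vocab)}

    return {
        "A only": row(vocab_a - vocab_b),   # difference
        "B only": row(vocab_b - vocab_a),
        "both": row(vocab_a & vocab_b),     # intersection
        "either": row(vocab_a | vocab_b),   # union
    }

# Toy word lists (hypothetical), not the transcript.
a_words = ["invest", "jobs", "jobs", "working"]
b_words = ["jobs", "tremendous", "tremendous", "totally"]
sets = exclusive_and_shared(a_words, b_words)
```

In the toy example only "jobs" is shared; it is said three times in all, so the shared row has a total of 3 and a unique count of 1.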

Table 7
exclusive word usage
Total and unique words used exclusively by a candidate, or by both.
set               part of speech     (a)     (b)      (c)      (d)     (e)      (f)
Hillary Clinton   n+v+adj+adv        995     100.0%   18.1%    708     71.2%    41.8%
Hillary Clinton   nouns (n)          461     46.3%    18.8%    348     75.5%    39.3%
Hillary Clinton   verbs (v)          290     29.1%    17.7%    224     77.2%    39.2%
Hillary Clinton   adjectives (adj)   208     20.9%    19.6%    157     75.5%    34.4%
Hillary Clinton   adverbs (adv)      36      3.6%     10.1%    30      83.3%    34.9%
Donald Trump      n+v+adj+adv        1,053   100.0%   19.1%    559     53.1%    33.0%
Donald Trump      nouns (n)          523     49.7%    21.4%    283     54.1%    32.0%
Donald Trump      verbs (v)          280     26.6%    17.1%    158     56.4%    27.6%
Donald Trump      adjectives (adj)   218     20.7%    20.6%    149     68.3%    32.7%
Donald Trump      adverbs (adv)      32      3.0%     9.0%     21      65.6%    24.4%
both candidates   n+v+adj+adv        3,458   100.0%   62.8%    427     12.3%    25.2%
both candidates   nouns (n)          1,347   39.0%    55.0%    179     13.3%    20.2%
both candidates   verbs (v)          992     28.7%    60.5%    135     13.6%    23.6%
both candidates   adjectives (adj)   520     15.0%    49.1%    78      15.0%    17.1%
both candidates   adverbs (adv)      278     8.0%     77.9%    27      9.7%     31.4%
total             n+v+adj+adv        5,506   100.0%   100.0%   1,694   30.8%    100.0%
total             nouns (n)          2,449   44.5%    100.0%   885     36.1%    100.0%
total             verbs (v)          1,641   29.8%    100.0%   572     34.9%    100.0%
total             adjectives (adj)   1,059   19.2%    100.0%   456     43.1%    100.0%
total             adverbs (adv)      357     6.5%     100.0%   86      24.1%    100.0%

Table 7
legend
a d
b e
c f
bar1
bar2

a :: total number of words in set (e.g. Clinton \ Trump, Clinton ∩ Trump, Clinton ∪ Trump), for a given part of speech

b :: (a) relative to all exclusive words in n+v+adj+adv

c :: (a) relative to all words in n+v+adj+adv

d :: unique words in (a)

e :: (d) relative to (a)

f :: (d) relative to all unique words in n+v+adj+adv

bar1 :: normalized ratio of (a-d):d

bar2 :: absolute ratio of (a-d):d for all POS groups (first column) or POS group (other columns)

Table 7
commentary

The words that Clinton used that Trump did not use were more diverse in every part of speech.

For example, she used +23.0% (348 vs 283) more nouns that Trump did not use, +41.8% (224 vs 158) more verbs, +5.4% (157 vs 149) more adjectives and +42.9% (30 vs 21) more adverbs.

Noun Phrase Usage

Noun phrases were extracted from the text and analyzed for frequency, word count, unique word count and richness. Single-word phrases were not counted.

Top-level noun phrases are those without a parent noun phrase (a parent phrase is a similar, longer phrase). Derived noun phrases are those with a parent (more details about noun phrase analysis).

The top-level noun phrases can be interpreted as independent concepts. Derived noun phrases can be interpreted as variants on concepts embodied by the top-level phrases.
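The split into top-level and derived phrases can be sketched as follows. The notion of "similar" is not defined in this section, so the sketch assumes a simple rule: a phrase has a parent if a longer phrase contains it as a contiguous run of words. The function names are hypothetical.

```python
def contains(longer, shorter):
    """True if the word tuple `shorter` appears contiguously inside `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))

def split_top_level(phrases):
    """Return (top_level, derived) lists of unique phrases, each sorted."""
    tokenized = [tuple(p.split()) for p in set(phrases)]
    top, derived = [], []
    for p in tokenized:
        # Assumed rule: a parent is any longer phrase that contains p.
        has_parent = any(len(q) > len(p) and contains(q, p) for q in tokenized)
        (derived if has_parent else top).append(" ".join(p))
    return sorted(top), sorted(derived)

# Toy phrases; the first three appear in the commentary as shared concepts.
top_level, derived = split_top_level(
    ["tax cuts", "biggest tax cuts", "fair share", "inner cities", "tax cuts"]
)
```

Under this rule "tax cuts" is derived from "biggest tax cuts", while the other phrases stand alone as top-level concepts.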

Noun Phrase Count and Length

This table reports the absolute number of noun phrases, which is related to the number of nouns, and their length.

Table 8a
noun phrase count
Counts of noun phrases in words and per noun.
speaker           set         (a)   (b)      (c)    (d)   (e)     (f)
Hillary Clinton   all         407   100.0%   0.00   237   58.2%   0.00
Hillary Clinton   top-level   349   85.7%    0.00   224   64.2%   0.00
Donald Trump      all         471   100.0%   0.00   224   47.6%   0.00
Donald Trump      top-level   395   83.9%    0.00   220   55.7%   0.00

Table 8b
noun phrase length
Average and 50%/90% cumulative length of noun phrases, in words.
speaker           set         average (a)   50% (b)   90% (c)
Hillary Clinton   all                2.36         2         4
Hillary Clinton   top-level          2.41         2         4
Donald Trump      all                2.29         2         3
Donald Trump      top-level          2.34         2         4

Table 8a
legend
a d
b e
c f
bar

a :: number of noun phrases

b :: (a) relative to number of all noun phrases

c :: number of noun phrases per noun

d :: number of unique phrases

e :: (d) relative to (a)

f :: number of unique noun phrases per unique noun

bar :: normalized ratio of (a-d):d

Table 8b
legend
a b c
bar

a :: average noun phrase size, in words

b :: largest noun phrase size in 50% of content

c :: largest noun phrase size in 90% of content

bar :: proportion of a:b:c


Table 8
commentary

The total number of concepts, as measured by unique noun phrases, was similar for both candidates, with Clinton having slightly more: +5.8% (237 vs 224). The length of the noun phrases was very similar between the candidates.

Exclusive and Shared Noun Phrase Count and Length

Table 9a
exclusive and shared noun phrase count
Counts of exclusive and shared noun phrases in words and per noun.
speaker           set         phrases (a)   share (b)   unique (c)   unique share (d)
Hillary Clinton   all                 370       42.1%          229              61.9%
Hillary Clinton   top-level           337       91.1%          220              65.3%
Donald Trump      all                 434       49.4%          217              50.0%
Donald Trump      top-level           385       88.7%          217              56.4%
both candidates   all                  74        8.4%           17              23.0%
both candidates   top-level            22       29.7%            8              36.4%

Table 9b
exclusive and shared noun phrase length
Average and 50%/90% cumulative length of noun phrases, in words.
speaker           set         average (a)   50% (b)   90% (c)
Hillary Clinton   all                2.40         2         4
Hillary Clinton   top-level          2.42         2         4
Donald Trump      all                2.31         2         3
Donald Trump      top-level          2.35         2         4
both candidates   all                2.03         2         2
both candidates   top-level          2.09         2         3

Table 9a
legend
a c
b d
bar

a :: number of noun phrases

b :: (a) relative to number of all noun phrases

c :: number of unique phrases

d :: (c) relative to (a)

bar :: normalized ratio of (a-c):c

Table 9b
legend
a b c
bar

a :: average noun phrase size, in words

b :: largest noun phrase size in 50% of content

c :: largest noun phrase size in 90% of content

bar :: proportion of a:b:c


Table 9
commentary

The number of unique noun phrases exclusive to each candidate was also similar, about 220.

Interestingly, the candidates only referenced 17 concepts that the other mentioned as well. Among these were Barack Obama, biggest tax cuts, fair share, first place, good deal, wealthy people, white house and worst things.

Windbag Index

The Windbag Index is a compound measure that characterizes the complexity of speech. A low index is indicative of succinct speech with a low degree of repetition and a large number of independent concepts.

Table 10
windbag index
Windbag Index for each speaker. The higher the value, the more repetitive the speech.
Index values are those quoted in the commentary; each index term is given as its numerator/denominator counts.
              Hillary Clinton   Donald Trump
index value               170          1,003
t1                2,707/6,226    3,268/7,941
t2                1,191/2,707    1,034/3,268
t3                  558/1,116      506/1,333
t4                    389/781        318/860
t5                    259/461        275/598
t6                     62/156         51/201
t7                    237/407        224/471
t8                    224/237        220/224
Table 10
legend
The Windbag Index is 1/(t1*t2*...*t8) where t1,t2,...,t8 are

t1 :: fraction of words that are non-stop

t2 :: fraction of non-stop words that are unique

t3 :: fraction of nouns that are unique

t4 :: fraction of verbs that are unique

t5 :: fraction of adjectives that are unique

t6 :: fraction of adverbs that are unique

t7 :: fraction of noun phrases that are unique

t8 :: fraction of noun phrases that are top-level

Large individual terms t1...t8 contribute to a smaller index.

The percentage values below the index and each term are relative differences to the other speaker's corresponding term (i.e. 100*(a-b)/b where a is the value for one speaker and b for the other).
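The index and the relative difference can be checked with a short calculation. The sketch below takes the eight terms as numerator/denominator pairs from the Table 10 term fields; matching those pairs to t1 through t8 in order is my reading of the table, and the function names are hypothetical.

```python
from math import prod

# (numerator, denominator) for t1..t8, read from the Table 10 term fields.
# Assumption: the pairs appear in the order t1, t2, ..., t8.
TERMS = {
    "Hillary Clinton": [
        (2707, 6226), (1191, 2707), (558, 1116), (389, 781),
        (259, 461), (62, 156), (237, 407), (224, 237),
    ],
    "Donald Trump": [
        (3268, 7941), (1034, 3268), (506, 1333), (318, 860),
        (275, 598), (51, 201), (224, 471), (220, 224),
    ],
}

def windbag_index(terms):
    """Reciprocal of the product of the term fractions."""
    return 1 / prod(n / d for n, d in terms)

def relative_difference(a, b):
    """Relative difference of a with respect to b, in percent."""
    return 100 * (a - b) / b

clinton = windbag_index(TERMS["Hillary Clinton"])
trump = windbag_index(TERMS["Donald Trump"])
quoted_difference = relative_difference(1003, 170)
```

The computed values come out just under 171 for Clinton and just under 1,004 for Trump, in line with the 170 and 1,003 quoted in the commentary, and the relative difference of the quoted values is 490.0%.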
Table 10
commentary

Trump's Windbag Index is off the chart, +490.0% (1,003 vs 170) larger than Clinton's. This is a dubious accomplishment: it is about a third higher than the highest I've seen in all previous debates, which was 754, delivered by Romney in his 2nd debate vs Obama.

Clinton's Index is surprisingly low—perhaps even too low?

Word Clouds

In the word clouds below, the size of the word is proportional to the number of times it was used by a candidate (method details).

Not all words from a group used to draw the cloud fit in the image — less frequently used words for large word groups may fall outside the image.

All Words for Each Candidate

Each candidate's debate portion was extracted and frequencies were compiled for each part of speech (noun, verb, adjective, adverb), with words colored by their part of speech category.

The distribution of sizes within a tag cloud follows the frequency distribution of words. However, word size cannot be compared between clouds, since the minimum and maximum size of the words is fixed.
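A small sketch shows why sizes cannot be compared between clouds. It assumes a linear mapping from word count to font size between a fixed minimum and maximum; the actual mapping and the size limits used for these clouds are not given here, so the numbers are hypothetical.

```python
def font_sizes(counts, min_size=10, max_size=60):
    """Map word counts linearly onto [min_size, max_size] (assumed mapping)."""
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        # All words equally frequent: draw them all at the same size.
        return {word: max_size for word in counts}
    scale = (max_size - min_size) / (hi - lo)
    return {word: min_size + (count - lo) * scale for word, count in counts.items()}

# Two toy clouds with very different absolute counts (hypothetical numbers).
cloud_one = font_sizes({"country": 40, "just": 20, "bad": 10})
cloud_two = font_sizes({"think": 8, "people": 4})
```

The most frequent word in each cloud is drawn at the maximum size even though one was used 40 times and the other 8 times, so size only ranks words within a single cloud.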

Debate Word Cloud for Hillary Clinton - all words

Debate tag cloud for Hillary Clinton

Debate Word Cloud for Donald Trump - all words

Debate tag cloud for Donald Trump
commentary

Word clouds are fun, even if they're very 1990's.

What did Clinton talk about? "Think", "people", "good" and "well" are all common words. As are "nuclear", "American" and "important".

Trump repeated "country", "many", "just" and had a balance of "good" and "bad".

Exclusive Words for Each Candidate

The clouds below show words used exclusively by a candidate. For example, if candidate A used the word "invest" (any number of times), but candidate B did not, then the word will appear in the exclusive word tag cloud for candidate A.

Words exclusive to Hillary Clinton

Debate tag cloud for Hillary Clinton

Words exclusive to Donald Trump

Debate tag cloud for Donald Trump
commentary

These clouds are fun because they show what the other candidate didn't say.

Shockingly, Trump never said "American" (only in the context of "African-American", which Clinton said many times too). He also didn't say "justice", "working", "everyone" and "information".

Trump, on the other hand, said "politicians", "agree" (surprise here), "tremendous" (no surprise here) and "totally".

Part of Speech Word Clouds

In these clouds, words from each major part of speech were colored based on whether they were exclusive to a candidate or shared by the candidates.

The size of the word is relative to the frequency for the candidate — word sizes between candidates should not be used to indicate difference in absolute frequency.

Cloud of noun words, by speaker

commentary

If we look at nouns unique to each speaker, Clinton's "justice", "information" and "everyone" stand out. Trump focused on "politicians", "nobody" and "chicago".

Cloud of verb words, by speaker

commentary

If we look at verbs unique to each speaker, Clinton's "working", "provide" and "invest" counter Trump's "leaving", "agree" and "losing".

Cloud of adjective words, by speaker

commentary

I still can't believe that Trump never said "American" or "foreign". Clinton didn't say "tremendous" or "greatest" (thankfully).

Cloud of adverb words, by speaker

commentary

Clinton had "deeply" and "finally" and "basically" while Trump said "totally", "soon" and "extremely". There's a sentence right there for you: "totally extremely soon".

Cloud of all words, by speaker

commentary

The tag clouds for each part of speech above are combined in this cloud.

Clinton has "working", "american" and "information" and Trump has "tremendous", "countries" and "leaving".

Word Pair Clouds for Each Candidate

word pairs for Hillary Clinton

^ adjective/adjective by Hillary Clinton
^ adjective/adverb by Hillary Clinton
^ adjective/noun by Hillary Clinton
^ adjective/verb by Hillary Clinton
^ adverb/adverb by Hillary Clinton
^ adverb/noun by Hillary Clinton
^ adverb/verb by Hillary Clinton
^ noun/noun by Hillary Clinton
^ noun/verb by Hillary Clinton
^ verb/verb by Hillary Clinton

word pairs for Donald Trump

^ adjective/adjective by Donald Trump
^ adjective/adverb by Donald Trump
^ adjective/noun by Donald Trump
^ adjective/verb by Donald Trump
^ adverb/adverb by Donald Trump
^ adverb/noun by Donald Trump
^ adverb/verb by Donald Trump
^ noun/noun by Donald Trump
^ noun/verb by Donald Trump
^ verb/verb by Donald Trump
commentary

These are all the unique part of speech pairs for each candidate. Keep in mind that some words may be misclassified by the tagger, if their role in the sentence is highly contextual.

Pairs unique to Clinton were "jobs rising", "need communities" and "new jobs".

Trump had "bring money", "bring back", "billions bring" and "inner cities".

Downloads

Debate transcript

Parsed word lists and word clouds (word lists, part of speech lists, noun phrases, sentences) (word clouds)

Raw data structure

Please see the methods section for details about these files.