Nathaniel Hawthorne
Collection
Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design. Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance. The following edits were made to the content to improve the framework:
- Words were stemmed.
- A stop word list was applied; the listed words were removed.
- The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount', 'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as', 'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.
- The Reasoning Behind the Selection - These words are high-frequency, non-unique, general terms. They are removed during the analytic stage of modeling so that the more distinctive terminology stands out. Other words could be included or excluded, as the method of removal isn't intended to be exact. However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model. After the analytic stage, these terms are returned to the informatic model for developing the networks, layering, directionality, and detailing of the model.
- Implications of Selection - The methodology generalizes the unstructured information. A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal. Regardless, the methodology returns these words to the corpus for the informatic modeling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words. That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
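The remove-then-return handling of stop words described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used for the collection: the function names (`tokenize`, `split_stop_words`, `restore`) are hypothetical, the tokenizer is an assumption, stemming is left out, and `STOP_WORDS` holds only a small subset of the disclosed list.

```python
import re

# Illustrative subset of the stop word list disclosed above (assumption:
# the full list would be used in practice).
STOP_WORDS = {"a", "about", "above", "the", "of", "and", "was", "in"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_stop_words(tokens, stop_words=STOP_WORDS):
    """Return (kept, removed); removed holds (position, token) pairs.

    Recording the positions lets the removed words be returned to the
    sequence in a later stage.
    """
    kept, removed = [], []
    for position, token in enumerate(tokens):
        if token in stop_words:
            removed.append((position, token))
        else:
            kept.append(token)
    return kept, removed


def restore(kept, removed):
    """Rebuild the original token sequence from the kept and removed parts."""
    removed_at = dict(removed)
    kept_iter = iter(kept)
    total = len(kept) + len(removed)
    return [removed_at[i] if i in removed_at else next(kept_iter)
            for i in range(total)]


tokens = tokenize("The scarlet letter was written in Salem")
kept, removed = split_stop_words(tokens)
```

Here `kept` would feed the analytic stage, while `restore(kept, removed)` gives back the full sequence for the later stages in which stop words are returned.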
Specific Cautionaries
The following cautionaries are more specific to the Hawthorne - Collection:
- A large variety of numbers and number-letter combinations marked news sections. All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage. A small number of low-frequency tokens in which numbers were meshed with words were removed as well. All such combinations were removed to improve the usability and clarity of the content being modeled informatically.
- No words were removed, other than what is listed on the Stop Word list. These words were removed only for the framing and analytic stages. Words are returned during the network, layering, and detailing stages of modeling.
- Errors in the content, such as word conversion errors, are not edited and will remain visible to viewers of the model. The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal. Exceptions will be listed in the "specific edits" section.
- Split words that are merged back together, if any, will be listed in specific edits.
- The usability standard is used moderately. That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process. The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
- This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design manifests from categorical arrangements that are inconsistent and imperfect, like that of a hairstyle. Even though the terms, the individual hairs, will change, the overall styling, the informatic model, will remain largely the same, with a consistent arrangement of major nodes. In this way, the unstructured informational environment differs from the structured informational environment.
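The removal of numbers and letter-number combinations noted in the cautionaries above can be sketched as a simple token filter. This is only an illustration under stated assumptions: the rule "drop any token containing a digit" is a simplification of the text's criterion, and the function name `drop_numeric_tokens` and its `keep` parameter (an allow-list for combinations that do constitute words or abbreviations) are hypothetical.

```python
import re

# Matches any token that contains at least one digit (assumed rule).
HAS_DIGIT = re.compile(r"\d")


def drop_numeric_tokens(tokens, keep=frozenset()):
    """Drop tokens containing digits unless they are explicitly allowed.

    `keep` is a hypothetical allow-list for letter-number combinations
    that should be treated as words or abbreviations.
    """
    return [t for t in tokens if t in keep or not HAS_DIGIT.search(t)]


sample = ["1850", "10th", "12mo", "salem", "9;48081;8", "custom"]
cleaned = drop_numeric_tokens(sample)
cleaned_with_allowance = drop_numeric_tokens(sample, keep={"12mo"})
```

With this rule, plain years, ordinals, and mixed strings are all filtered, while anything placed in `keep` survives.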
0
0
0
0
0
0
00
00443
02
02
04.29.93
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1.25
1.75
1;48
10
10
10
10
10
10
10
10
10,000
10003
10135
10135
10135
10t
10th
10th
10th
10th
10th
10th
10th
10th
10th
10th
10th
10th
11
11
11
11
11
11
11
11
11
11
113
113547
11th
11th
11th
11th
11th
11th
11th
11th
11th
11th
11th
11th
12
12
12
12
12
123
1240
12mo
12th
12th
12th
12th
12th
12th
12th
12th
12th
12th
12th
12th
12th
13
13
13
13
13
13
1300
133
137
13707
13707
1388
13th
13th
13th
13th
13th
13th
13th
13th
13th_
14
14
14
14
14
14
14
141
1437
14th
14th
14th
14th
14th
14th
14th
14th
14th
14th
14th_
15
15
15
1500
15697
15697
1571
1577
159
1599
15th
15th
15th
15th
15th
15th
15th
15th
15th
15th
15th
15th
16
16
16
161
1612
1613
1614
1620
1620
1626
1640
1642
1644
1648
1658
1659
1661
1667
1678
1689
1689
1690
1692
1692
1692
1697
16mo
16th
16th
16th
16th
16th
16th
16th
16th
16th
17
17
17
17
17
1706
1709
1726
1727
1731
1734
1738
1745
175
1763
1763
1765
1765
1767
1768
1770
1770
1770
1771
1773
1773
1774
1775
1775
1775
1776
1776
1777
1777
1779
1781
1781
1781
1783
1783
1784
179
1790
1794
17th
17th
17th
17th
17th
17th
17th
17th
17th
18
18
18
18
18
1802
1803
1804
1804
1806
1807
1807
1808
1809
1810
1812
1813
1815
1820
1820
1825
1826
1826
183
1831
1832
1833
1834
1834,1835
1835
1835
1836
1836
1837
1837
1837
1838
1838
1838
1838
1838
1838
1839
1839
1839
1840
1840
1840
1841
1841
1841
1841
1841
1842
1842
1842
1842
1842
1842
1842
1842
1843
1843
1843
1844
1844
1844
1845
1845
1845
1846
1846
1847
1850
1850
1850
1850
1850
1851
1851
1851
1851
1851
1851
1851
1851
1851
1852
1852
1852
1852
1853
1853
1854
1855
1855
1856
1856
1857
1857
1857
1858
1858
1859
1859
1860
1860
1860
1860
1860
1862
1862
1863
1863
1864
1864
1864
1864
1867
1868
1870
1870
1871
1871
1871
1872
1872
1872
1873
1876
1877
1879
1879
188
1882
1882
1882
1882
1883
1883
1883
1885
1888
1891
1891
1896
18mo
18th
18th
18th
18th
18th
18th
18th
18th
19
19
19
19
19
1902
1903
1903
1903
1909
191
1916
1916
1916
1926
1926
197
1972
1992
1996
1997
1999
19th
19th
19th
19th
19th
19th
19th
19th
19th
19th
1st
1st
1st
1st
1st
1st
1st
1st
1st
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
20
20
20
20
20
20
2000
2001
2002
2003
2003
2003
2003
2003
2003
2003
2003
2003
2003
2003
2003
2003
2003
2004
2004
2005
2005
2005
2005
2005
2006
2007
2007
2007
2007
2008
2008
2008
2008
2008
2008
2010
2010
2010
2010
2010
2010
2010
2011
2011
2012
2013
2013
2014
2014
2014
2014
2016
2016
2017
2017
204
2081
2081
20th
20th
20th
20th
20th
20th
21
21
21
211
218
2181
2181
2182
2182
21st
21st
21st
21st
21st
21st
21st
22
22
22
22
22
22
22
22
22
22d
22d
22d
22d
22d
22d
22d
22d
22d
23
23
23
23
23
23
23
23
23
23068
23068
233
23d
23d
23d
23d
23d
23d
23d
24
24
24
24
24
24
24
249
24th
24th
24th
24th
24th
24th
25
25
25
25
25
25
25th
25th
25th
25th
25th
25th
25th
25th
26
26
26
26
26
26
263
26th
26th
26th
26th
26th
26th
26th
26th
26th
27
27
27
27
27
27
27
27
27
27
27th
27th
27th
27th
27th
27th
27th
27th
27th
27th
28
28
28
28
28
281
289
28th
28th
28th
28th
28th
28th
28th
29
29
29
294
29th
29th
29th
29th
2d
2d
2d
2d
2d
2d
2d
2d
2nd
2nd
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
30
30
30
30
30
300
30th
30th
30th
30th
30th
30th
30th
30th
30th
31
31
31
31
313
31st
31st
31st
31st
32
33
33
332
338
34
34;48
358
36
37
370
38
393
3d
3d
3d
3d
3d
3d
3d
3d
3d
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
40
40
404
4069285
447
455
48
485
485
49
49
49
4956
4th
4th
4th
4th
4th
4th
4th
4th
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5,1843
50
50
50
50,000
500
512
512
512
513
513
528806
53
53
53
56
57
58
5th
5th
5th
5th
5th
5th
5th
5th
5th
6
6
6
6
6
6
6
6
6
6
6
6
6
60
63
65
6th
6th
6th
6th
6th
6th
7
7
7
7
7
7
7
7
7
7085
7085
7119
7183
7372
77
77
77
7877
7877
7879
7879
7880
7880
7937
7937
7th
7th
7th
7th
7th
7th
7th
7th
7th
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8_
8088
8088
8089
8089
8090
8090
8090
8091
8091
81
8207
8207
8429
8429
85;4
87
88;4
8th
8th
8th
8th
8th
8th
8th
8th
8th
8th
8th
8th
8vo
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9;48
9;48081;8
91
92
92
9201
9201
9202
9202
9202
9203
9203
9204
9204
9205
9205
9206
9206
9206
9207
9207
9208
9208
9209
9209
9210
9210
9211
9211
9211
9212
9212
9213
9213
9214
9214
9214
9215
9215
9216
9216
9217
9217
9218
9218
9219
9219
9220
9220
9221
9221
9221
9222
9222
9222
9223
9223
9223
9224
9224
9224
9225
9225
9225
9226
9226
9226
9227
9227
0
0
0
00
08
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1.25
10
10
10
10
10
100
100,000
100,000,000
10003
10135
10135
10135
10135
10th
10th
10th
11
11
11
1100
1109
113
113547
1150
11th
11th
11th
11th
11th
12
12
12
12
12
12th
12th
12th
12th
13
13
13707
13th
13th
13th
13th
13th
13th
13th_
14
14
14
1437
14th
14th
14th
14th
14th
14th
15.60
15697
15697
1582
159
1599
15th
15th
15th
15th
16
1612
1613
1613
1626
1631
1642
1642
1644
1649
1665
1667
1685
1692
1692
1692
16th
16th
16th
17
17
1706
1709
1717
1721
1726
1730
1734
1738
1747
175
1754
1759
1760
1763
1769
1770
1776
1784
17th
17th
17th
18
18
1803
1804
1806
1807
1812
1820
1821
1824
1834,1835
1836
1838
1838
1838
1839
1840
1841
1841
1842
1842
1842
1842
1844
1845
1845
1846
1851
1852
1852
1852
1853
1855
1857
1858
1858
1859
1862
1863
1863
1868
1870
1871
1872
1872
1876
1876
1879
1882
1882
1882
1883
1883
1883
1883
1885
1891
18th
18th
18th
19
1903
1916
1916
1916
1916
1926
1926
1972
1996
1997
1999
19th
19th
19th
19th
19th
19th
19th
19th
19th
1st
1st
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2,000
20
20
20
2000
2002
2003
2003
2003
2003
2003
2004
2005
2005
2006
2007
2007
2008
2010
2011
2011
2012
2013
2014
2016
2017
204
2043
2081
2081
21
21
21
211
218
2181
2181
2182
2182
21st
21st
22
22d
22d
23
23
23068
23068
233
23d
23d
23d
24
24
249
24th
24th
24th
25
25
25
25th
25th
25th
26
26
26
263
26th
26th
26th
27
27
27th
27th
27th
27th
27th
28th
28th
28th
29
29th
29th
29th
2d
2d
2d
2nd
3
3
3
3
3
3
3
3
3
3
3
30th
30th
30th
30th
31
31
32
323
33
33
33
335
358
37
370
38
38
39,620
3d
3d
3d
3d
4
4
4
4
4
4
4
4
40
400
48
49
4th
5
5
5
5
5
5
5
5,1843
50
500
51
512
512
512
512
513
513
53
53
53
56
5th
5th
5th
6
6
6
6,000
60
65
67
68
6th
6th
6th
7
7
7
7
7
7085
7085
7119
72,000
7372
757
77
77
77
77
7877
7879
7880
79
7937
7th
7th
7th
7th
7th
7th
8
8
8
8
80
8088
8089
8090
8090
8090
8091
8207
8207
8429
8429
87
8th
8th
8th
8th
9
9
9
9
90
90
900
9201
9201
9202
9202
9202
9202
9203
9203
9204
9204
9205
9205
9206
9206
9206
9206
9207
9207
9208
9208
9209
9209
9210
9210
9211
9211
9211
9211
9212
9212
9213
9213
9214
9214
9214
9214
9215
9215
9216
9216
9217
9217
9218
9218
9219
9219
9220
9220
9221
9221
9221
9221
9222
9222
9222
9222
9223
9223
9223
9223
9224
9224
9224
9225
9225
9225
9226
9226
9226
9227
9227
9227
9227
9228
9228
9228
9229
9229
9229
9230
9230
9230
9231
9231
9231
9232
9234
9234
9234
9234
9235
9235
9235
9236
9236
9237
9237
9238
9238
9239
9239
9239
9239
9240
9240
9240
9240
9241
9241
9242
9242
9242
9242
9243
9243
9244
9244
9244
9244
9245
9245
9246
9246
9247
9247
9248
9248
9249
9249
9250
9250
9251
9251
9251
9251
9252
9252
9252
9252
9253
9253
9254
9254
9255
9255
9256
9256
9257
9257
9258
9258
976
976
9th
9th
9th
9th
9th
9th
9227
9228
9228
9228
9229
9229
9229
9230
9230
9230
9231
9231
9231
9232
9232
9232
9234
9234
9234
9235
9235
9235
9236
9236
9237
9237
9238
9238
9239
9239
9239
9240
9240
9240
9241
9241
9242
9242
9242
9243
9243
9244
9244
9244
9245
9245
9246
9246
9247
9247
9248
9248
9249
9249
9250
9250
9251
9251
9251
9252
9252
9252
9253
9253
9254
9254
9255
9255
9256
9256
9257
9257
9258
9258
93
94
95
96
96
97
976
976
98
99
99
9th
9th
9th
9th
9th
9th
9th
9th
9th
9th