P-2-7-20-2

Thomas de Quincey
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, non-unique, general terms.  They are removed during the analytic stage of modeling so that the more distinctive terminology of the content stands out.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model for developing the networks, layering, directionality, and detailing of the model.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may or may not include some unique terms, and may or may not meet a particular standard asserted as ideal.  Regardless of such nuanced differences, the methodology returns these words to the corpus for the informatic modelling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
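The remove-then-return treatment of stop words described above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used for the collection: the function names, the tokenizer, and the sample sentence are assumptions, and the stop word set is only a six-word excerpt of the list above. Stemming is left out because the stemmer used is not named here.

```python
import re

# Excerpt only: six entries taken from the stop word list above.
STOP_WORDS = {"a", "about", "an", "and", "of", "the"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def analytic_view(tokens, stop_words=STOP_WORDS):
    """Return the tokens seen by the analytic stage, with stop words filtered out."""
    return [tok for tok in tokens if tok not in stop_words]


# Illustrative sentence, not taken from the collection.
text = "The author wrote about the art of prose and the power of dreams"

# The full token sequence is kept untouched, so stop words can be "returned"
# for the network, layering, and detailing stages.
full_tokens = tokenize(text)

# Only the analytic stage works on the filtered view.
content_tokens = analytic_view(full_tokens)

print(content_tokens)
```

Because the filtered view is derived from the full sequence rather than replacing it, swapping in a slightly different stop word list changes only the analytic view and leaves the material available to the later stages intact.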


Specific Cautionaries

The following cautionaries are more specific to the de Quincey Collection:
  • There was a large variety of numbers and number-letter combinations that marked news sections.  All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  A small number of low-frequency tokens in which numbers were fused with words were removed as well.  All such combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed, other than what is listed on the Stop Word list.  These words were removed only for the framing and analytic stages.  Words are returned during the network, layering, and detailing stages of modeling. 
  • Errors involving the content, such as conversion errors in words, are not edited and will remain visible to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "specific edits" section.
  • Split words that are merged back together, if any, will be listed in specific edits.
  • The usability standard is applied moderately.  That is, terms like "ebook", or proper nouns, such as publisher names, or any other term reflective of the overall publication, will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, that will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design emerges from categorical arrangements that are themselves inconsistent and imperfect, much like a hair style.  Even though the terms, like individual hairs, will change, the overall styling, the informatic model, will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
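The removal of numbers and number-letter combinations noted in the first cautionary could be approximated as below. This is a sketch under a stated assumption: it treats any token containing a digit as removable, which is consistent with the tokens logged under Specific Edits but may not be the exact rule that was applied. The function name and the word tokens in the sample are hypothetical; the numeric tokens are taken from the Specific Edits list.

```python
import re

# Assumed rule: a token is removable if it contains at least one digit.
HAS_DIGIT = re.compile(r"\d")


def split_numeric(tokens):
    """Separate tokens into (kept, removed) using the assumed digit rule."""
    kept, removed = [], []
    for tok in tokens:
        if HAS_DIGIT.search(tok):
            removed.append(tok)  # logged so the removal stays fully disclosed
        else:
            kept.append(tok)
    return kept, removed


# Numeric tokens come from the Specific Edits list; the words are illustrative.
tokens = ["opium", "1821", "126mm", "essay", "13.6", "18,000", "30th", "london"]
kept, removed = split_numeric(tokens)

print(kept)
print(removed)
```

Returning the removed tokens alongside the kept ones mirrors the practice of listing every removal, so that viewers of the model can check what was taken out.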

Specific Edits

0 0 0 0 0 0 0130ã 01min 03 08 0ai3pa 0f 0mm 0n 0n 0ng 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.9 10 10 10 10 10 100 100 1000 103 104 106 107 107 108 109 11 11 11 11.9 110 111 113 114 115 116 117 117 118 119 12 12 12 12 12 12 121 122 123 125 126mm 127 129 13 13 13 13.6 13.6 13.6 132 134 135 137 138 13m 14 14 140 141 1411ã 142 144 145 146 147 15 15 150 151 152 1520 154 154 1547 157 1572 158 1595 16 16 160 1606 161 1633 164 1649 165 167 167 168 168 1684 169 1691 1698 1698 17 17 17 17 17 17 17 17 170 1708 171 1710 1713ch 1714 1717 1719 1720 1720 1722 1728 173 1737 175 1757 176 1766 1768 1768 177 1770 1771 1771 1772 1775 1777 1778 1779 1779 178 1787 1787 1788 1789 179 1794 1794 1794 1796 1796 1797 1797 1797 1798 17l5 17ml 18 18 18 18,000 180 1800 1801 1804 1804 1805 1805 1806 1807 1807 1808 1808 1808 1808 1808 1809 1812 1812 1817 1819 1819 182 1821 1821 1821 1822 1822 1823 1823 1823 1826 1827 1828 1829 183 1831 1832 1832 1832 1834 1835 1836 1838 1838 184 184 1841 1841 1842 1842 1844 1845 1845 1848 1848 1848 185 1852 1856 1856 1864 188 189 189 19 19 19 19 190 192 193 193 194 196 199 1ã 1ã 1ã 1ã 1ã 1ã 1ã 1ã 1ã 1e 1e 1mm 1n 1n 1st 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 20 20 20 20 20 20,000 200 200 200 2000 204 205 208 21 21 21 21 21 210 211 212 213 215 215 216ã 217 218 219 22 22 22 220 221 222 223 224 225 227 228 229 23 23 23 231 232 233 235 236 237 239 24 24 24 240 241 241 242 244 245 246 247 248 249 25 25 25 251 252 253 254 255 256 259 26 262 263 265 267 267 268 269 27 27 27 270 271 272 274 275 276 278 28 280 281 282 283 284 285 286 287 289 29 29 29 290 292 293 294 296 298 299 2ã 2ã 2d 2dly 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3.6 30 30 300 3000 305 306 307 307 308 309 30ã 30th 31 31 31 31 311 312 
312 313 313cm 314 317 32 32 32 321 322 323 324 325 326 327 328 329 33 330 331 332 332 333 334 335 336 337 338 339 34 34 340 341 342 343 344 345 346 347 348 349 35 35 350 350 351 352 353 354 359 36 360 361 362 364 368 370 371 372 373 375 376 379 38 38 380 383 384 386 387 388 389 39 39 39 392 396 398 3ã 3ã 3d 4 4 4 4 4 40 40 400 401 402 404 406 408 41 41 41 410 412 414 415 416 42 43 44 44 45 45 46 46 46 461794 48 48 49 49 4i.e 5 5 5 5 50 51 52 53 53 54 54 55 56 57 58 58 5ã 5m 6 6 6 6 6 6 6 60 60 602th 61 62 63 0 0 0 0 0 01min 08 0f 0n 0n 0ng 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.4 1.9 10 10 10 10 10 10 100 100 100 104 105 106 107 108 10cal 10cal 11 11 11.9 110 111 114 115 116 117 118 119 12 12 12 121 121 122 123 126mm 129 12mo 13 13 13.6 13116 132 134 135 138 13m 140 142 144 145 146 147 15 150 152 154 158 16 16 16 160 160 1606 164 1644 165 166 167 168 17 170 1704 171 1713ch 1745 175 176 1766 177 178 1789 1789 179 1794 1794 1796 1796 1796 1796 1797 1797 17ml 17th 18 18 180 180 1800 1803 1814 1816 1817 182 1821 1822 1826 1828 183 1832 184 184 1844 1848 1848 185 1856 1856 1858 188 18th 19 190 192 194 195 196 1e 1e 1mm 1n 2 2 2 2 2 2 2 2 2 2 2 2 20 20 20 20 200 204 207 208 21 210 211 212 213 214 215 218 22 22 220 222 224 228 23 231 232 233 236 237 239 24 240 242 244 245 246 248 25 25 25 250 252 254 256 259 26 260 262 263 267 268 270 271 272 274 276 278 28 280 282 283 284 286 288 289 29 29 290 292 294 296 298 299 3 3 3 3 3 3 3 30 300 303 306 307 308 308 30th 31 31 311 312 314 317 32 32 320 322 324 326 327 328 329 33 330 331 332 332 334 335 336 338 339 34 34 340 342 343 344 346 348 35 352 353 354 36 36 360 360 362 364 368 37 370 371 372 373 374 375 376 377 38 380 383 384 386 387 388 389 39 392 393 396 398 3d 4 4 4 40 400 402 404 406 408 41 41 41 410 412 414 415 416 416 42 42 43 43 44 45 46 46 47 48 4d 4i.e 4to 5 5 5 5 500 51 515 52 52 53 53 54 55 55 56 56 57 58 58 58 6 6 6 6 60 61 62 63 64 64 66 67 68 68 
6a 7 7,000 70 70,000 72 73.6 74 76 8 8 8 8 8 8 8 8 8,000 80 8000 810 813 816 818 819 82 85 850 854 855 86 866 88 88 88 882 8d 9 9 9 9 9 9 9 9 9 90 90 92 92 94 95 96 63 64 65 67 68 68 68m 69 69 6p5u77a 6poa1 6w 7 7 7 7 7 7 7 7 7 7 70 700 71 712 72 73.6 74 742 75 76 76v 77 77 78 79 7mm 8 8 8 8 8 8 8 80 81 81 81 810 813 815 816 818 819 82 82 83 85 85 86 863 866 87 87 88 88 882 89 8d 9 9 9 9 9 9 9 9 9 9 9 9 90 90 91 92 92 92 93 94 95 96 97 99 9ã