P-3-12-20-3

Thomas Carlyle
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, general, non-unique terms.  They are removed during the analytic stage of modeling so that the more distinctive terminology of the content stands out.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model for developing its networks, layering, directionality, and detailing.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal.  Regardless, the methodology returns these words to the corpus for the informatic modeling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
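The removal and return of stop words described above can be sketched as follows.  This is a minimal illustration only, not the tooling actually used for the collection: the function name analytic_view, the sample sentence, and the shortened stop word set (a small subset of the list above) are assumptions made for the example.

```python
# Hypothetical sketch: a shortened subset of the stop word list above.
STOP_WORDS = {"a", "about", "and", "is", "of", "the", "to"}


def analytic_view(tokens, stop_words=STOP_WORDS):
    """Return a filtered copy of the tokens for the analytic stage.

    The original token list is left untouched, so the stop words are
    still available to the later network, layering, and detailing stages.
    """
    return [tok for tok in tokens if tok.lower() not in stop_words]


# Illustrative sentence (an assumption, not taken from the corpus).
tokens = "The hero is a man of letters".split()
analytic = analytic_view(tokens)

print(analytic)  # tokens used for the analytic stage
print(tokens)    # full token list, stop words intact
```

Because the filter produces a view instead of altering the source tokens, a different stop word list changes only what the analytic stage sees, not what is returned to the model afterward.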


Specific Cautionaries

The following cautionaries are more specific to the Carlyle Collection:
  • A large variety of numbers and number-letter combinations marked news sections.  All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  Some low-frequency tokens in which numbers are meshed with words were removed as well.  All of these combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed, other than what is listed on the Stop Word list.  These words were removed only for the framing and analytic stages.  Words are returned during the network, layering, and detailing stages of modeling. 
  • Errors involving the content, such as conversion errors in words, are not edited and will remain visible to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "specific edits" section.
  • Split words that are merged back together, if any, will be listed in specific edits.
  • The usability standard is applied moderately.  That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design manifests from categorical arrangements that are themselves inconsistent and imperfect, much like a hair style.  Even though the terms, the individual hairs, will change, the overall styling, the informatic model, will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
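The first cautionary in this list, the removal of numbers and number-letter combinations, can be sketched as below.  This is a rough illustration under a stated assumption: it treats any token containing a digit as removable, which is a simplification of the rule described above and not necessarily the exact rule that was applied.  The function name split_numeric_tokens and the sample tokens are hypothetical; the removed examples are drawn from the Specific Edits list.

```python
import re

# Assumption: a token is removable if it contains at least one digit.
HAS_DIGIT = re.compile(r"\d")


def split_numeric_tokens(tokens):
    """Split tokens into (kept, removed).

    Removed tokens are returned as well, so they can be disclosed in a
    "specific edits" list instead of being silently dropped.
    """
    kept, removed = [], []
    for tok in tokens:
        if HAS_DIGIT.search(tok):
            removed.append(tok)
        else:
            kept.append(tok)
    return kept, removed


sample = ["french", "1837", "revolution", "8vo", "5erficial", "1,800,000"]
kept, removed = split_numeric_tokens(sample)

print(kept)     # tokens retained for modeling
print(removed)  # tokens to disclose under Specific Edits
```

Returning the removed tokens alongside the kept ones mirrors the disclosure practice of this document, where every removed token is listed.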

Specific Edits

0 0 01 01 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 10 10.10 101 103 105 1051 1051 1051 106 107 109 10b 11 11 11 11.11 113 114 115 117 119 121 122 123 125 127 129 13 13 13.13 130 131 131 132 133 135 137 139 14 141 143 146 147 148 149 15 15 15 15.15 151 152 153 155 157 159 1599 16 16 16.16 161 161 163 164 165 166 167 169 17 1709 171 171 1711 173 1749 175 1751 1774 1780 1783 1784 1784 1785 1786 1787 1788 179 1791 1791 1792 1792 1793 17th 18.18 181 182 1822 1824 1825 1825 1828 1829 183 1830 1830 1830 1831 1831 1831 1831 1831 1831 1831 1832 1832 1832 1832 1833 1833 1833 1835 1835 1837 1837 1838 1839 184 185 186 1869 187 189 1897 19 19 19.19 191 193 196 197 199 1i 2 2 2 2 2 2 2 2 2 2 2 2.2 2008 201 2019 203 205 207 208 209 21 21 21.21 210 211 213 215 217 22.22 220 220 221 223 227 228 23 230 231 234 236 237 239 24.2i 242 245 245 247 248 249 25 25 250 253 255 257 26 26 266 267 26y 27.27 270 271 273 275 277 279 28 28 280 281 283 285 286 287 287 28tk 29 29 29 29 29 29 29 29 29 29 29 29 29 29 290 290 291 292 293 295 296 298 299 2d 2u7 3 3 3 3 3.3 30 301 303 305 307 309 30g 30th 31 31 310 311 312 313 315 316 317 319 321 323 325 327 33 331 335 338 339 339 34 340 341 343 344 345 346 347 352 354 362 37 370 374 378 382 39 3og 4 4 4 40 400 404 41 410 417 419 42 422 424 427 428 429 43 431 431 434 435 437 438 439 44 445 45 451 453 46 46 461 464 469 47 473 475 476 479 481 482 483 484 485 486 487 489 489 490 491 492 493 494 495 499 5 5 5 5 50 505 507 510 513 514 517 518 520 522 526 534 538 544 545 547 549 55 551 552 553 554 555 559 561 563 564 567 57 570 571 573 574 575 580 581 586 587 59 595 597 598 599 5erficial 6 6 60 601 0 0 0 01 01 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1,800,000 101 105 1051 106 107 109 10b 11 117 119 122 123 1278 129 130 131 135 1390 141 146 147 148 149 152 155 157 16 16 161 164 166 167 1708 1709 1752 1782 1783 1787 179 1791 1792 17th 181 182 1824 1825 1831 1831 1831 1831 1832 1832 1832 1833 1835 
1837 1837 1839 184 186 1869 187 189 191 193 196 197 1i 2 2 2 2 2 2008 2019 207 209 21 210 211 213 218 220 220 223 228 230 231 234 236 237 239 242 245 247 25 250 26 26 266 26y 270 277 279 28 280 281 283 286 287 28tk 29 29 290 292 293 295 296 298 299 2d 3 3 3 30,000 309 310 311 312 313 315 316 319 325 327 331 340 343 344 346 352 353 354 362 37 370 374 375 377 378 381 382 383 39 391 3g1 3og 40 400 404 410 42 422 424 428 429 430 431 431 434 435 437 438 439 44 445 451 46 464 476 479 482 486 489 490 492 493 494 495 4ta3 5 507 513 514 517 518 520 522 526 534 538 544 545 549 551 552 553 555 558 559 564 57 570 573 574 580 581 586 587 595 597 598 599 6 60 601 601 609 611 613 616 619 623 626 628 629 63 631 632 66 66 694 6m 7 7 7 71 71 75 7th 80 833 864 89 8vo 8vo 8vo 8vo 9 9 90 90 92 93 94 98 601 609 611 613 616 619 623 626 628 629 63 63 631 632 633 66 66 678 679 694 6i 6m 7 7 7 7 71 73 75 7th 8 83 833 837 85 864 8vo 8vo 8vo 9 90 92 93 94 97 98