P-6-25-20-1

Fyodor Dostoyevsky
Collection
Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design. Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance. The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These are high-frequency, general-purpose words that are not distinctive to the content. They are removed during the analytic stage of modeling so that the more distinctive terminology stands out. Other words could be included or excluded, as the method of removal isn't intended to be exact. However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model. After the analytic stage, these terms are returned to the informatic model for developing its networks, layering, directionality, and detailing.
  • Implications of Selection - The methodology generalizes the unstructured information. A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal. In either case, the methodology returns these words to the corpus for the informatic modeling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words. That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
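
The stop word handling described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for the collection: the function names, the tokenization rule, and the sample sentence are assumptions, and only a short excerpt of the disclosed list is reproduced. Stemming is not shown because the stemmer is not named. The sketch keeps the full token sequence alongside the filtered one, since stop words are removed only for the framing and analytic stages and returned afterward.

```python
import re

# Excerpt of the disclosed Stop Word List (the full list appears above).
STOP_WORDS = frozenset({"a", "about", "and", "in", "of", "the", "to", "was", "with"})


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits.

    Hypothetical tokenization rule, chosen only for this sketch.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def split_for_stages(text, stop_words=STOP_WORDS):
    """Return (all_tokens, analytic_tokens).

    all_tokens keeps every token for the later network, layering, and
    detailing stages; analytic_tokens drops stop words and is meant for
    the framing and analytic stages only.
    """
    all_tokens = tokenize(text)
    analytic_tokens = [t for t in all_tokens if t not in stop_words]
    return all_tokens, analytic_tokens


all_tokens, analytic_tokens = split_for_stages(
    "The prince was in the garden with a letter."
)
print(analytic_tokens)  # ['prince', 'garden', 'letter']
print(len(all_tokens))  # 9
```

Because the unfiltered sequence is never discarded, a different stop word list changes only the analytic view, which is consistent with the point that lists may vary.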


Specific Cautionaries

The following cautionaries are more specific to the Dostoyevsky - Collection:
  • A large variety of numbers and number-letter combinations marked news sections. All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage. A small number of low-frequency tokens in which numbers were meshed with words were removed as well. All such combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed other than those on the Stop Word List. These words were removed only for the framing and analytic stages, and they are returned during the network, layering, and detailing stages of modeling.
  • Errors in the content, such as word conversion errors, are not edited and will remain visible to viewers of the model. The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal. Exceptions will be listed in the "Specific Edits" section.
  • Split words that are merged back together, if any, will be listed in the "Specific Edits" section.
  • The usability standard is applied moderately. That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process. The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design manifests from categorical arrangements that are themselves inconsistent and imperfect, much like a hairstyle. Even though the terms (the individual hairs) will change, the overall styling (the informatic model) will remain largely the same, with a consistent arrangement of major nodes. In this way, the unstructured informational environment differs from the structured informational environment.
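
The removal of numbers and number-letter combinations can be sketched as below. This is an illustration under stated assumptions rather than the rule actually applied: it treats any token containing a digit as removable, whereas the cautionary above exempts combinations that constitute words or abbreviations, and the function names and sample tokens are hypothetical. Removed tokens are returned in plain string order with repeats kept, so that they can be disclosed in a listing.

```python
def contains_digit(token):
    """True if any character in the token is a digit."""
    return any(ch.isdigit() for ch in token)


def remove_numeric_tokens(tokens):
    """Split tokens into (kept, removed).

    kept preserves the original order of tokens without digits.
    removed holds every digit-bearing token, duplicates included,
    sorted as strings for disclosure.
    Assumption: no digit-bearing token is a word or abbreviation
    that should be retained.
    """
    kept = [t for t in tokens if not contains_digit(t)]
    removed = sorted(t for t in tokens if contains_digit(t))
    return kept, removed


# Hypothetical sample tokens, not taken from the collection.
sample = ["chapter", "10th", "prince", "003", "1", "ebook", "1"]
kept, removed = remove_numeric_tokens(sample)
print(kept)     # ['chapter', 'prince', 'ebook']
print(removed)  # ['003', '1', '1', '10th']
```

Keeping the removed tokens, rather than discarding them silently, matches the emphasis on full disclosure of every edit.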

Specific Edits

0 0 0 003 009 010 011 012 022 024 027 029 031 033 034 035 036 037 039 040 041 042 044 047 048 049 052 059 064 069 074 076 079 083 086 090 096 097 098 1 1 1 1 1 1 1 1 1 1 10 100 100 1000 1001 1002 1003 1005 101 101 1015 102 102 1023 1025 103 105 106 107 108 109 10th 11 111 112 112 113 113 115 116 117 117 118 11th 11th 121 124 129 129 12th 12th 131 133 135 138 139 139 13th 14 14 141 146 14th 15 15 150 151 152 153 154 155 156 158 159 162 163 165 165 167 168 168 169 17 170 171 171 173 175 175 176 177 177 180 182 183 184 186 1861 187 188 189 18th 19 19 191 1916 193 193 194 195 195 196 197 198 199 19th 19th 1st 1st 1st 2 2 2 2 2 200 200 202 203 204 205 205 205 206 206 208 208 20th 21 21 210 210 214 215 215 216 217 217 218 218 219 21st 220 221 223 223 224 225 227 229 229 22nd 23 2302 232 234 235 238 239 23rd 24 24 241 243 245 246 247 248 249 249 251 252 253 2554 256 257 258 259 26 260 261 262 263 2638 265 266 267 268 269 26th 27 270 271 272 273 274 274 275 275 277 279 27th 27th 27th 28054 281 281 283 283 284 284 285 285 287 287 289 28th 28th 28th 290 291 292 292 293 295 295 296 297 297 299 29th 2nd 3 3 3 301 302 303 305 307 307 309 30th 31 311 313 314 315 316 317 319 32 321 322 325 327 329 33 331 334 336 337 338 344 346 346 347 349 35 350 351 356 356 357 359 36 36 36 361 362 363 365 365 366 367 37 371 372 373 373 375 376 377 378 379 379 38 380 380 382 383 384 387 388 389 39 390 391 392 392 394 395 396 4 40 402 405 408 409 41 410 411 413 415 416 419 419 421 423 425 425 429 43 430 431 431 432 432 433 435 437 439 440 443 445 445 447 448 448 449 45 450 451 452 452 453 454 455 456 456 457 458 458 459 46 461 461 462 465 466 467 469 469 47 47 471 472 473 475 475 476 477 477 478 479 480 481 481 482 484 484 485 487 487 49 491 493 493 494 495 495 497 498 499 499 5 5 5 5 5 500 501 503 504 505 507 508 509 51 511 511 513 515 517 519 520 521 522 523 524 525 527 528 531 532 534 535 537 538 539 540 540 541 543 543 545 545 546 547 548 549 55 551 551 552 553 553 554 555 557 558 559 559 561 
561 562 563 564 565 566 567 568 569 569 571 571 572 577 577 579 58 582 584 585 586 588 589 59 590 591 591 592 593 594 595 596 597 597 599 599 5th 5th 6 6 6 6 6 60 60 600 600 601 603 604 607 609 61 610 611 611 612 612 613 615 616 617 618 619 62 620 621 623 624 625 627 629 63 630 631 633 633 634 635 635 636 637 637 639 639 641 644 644 645 647 65 651 651 653 654 655 656 658 659 662 663 663 663 664 668 67 671 673 674 675 677 678 678 679 68 681 681 682 683 684 684 685 686 687 688 689 689 691 693 694 696 698 699 699 7 7 7 70 700 701 702 702 703 704 705 707 708 710 712 712 713 714 715 715 716 720 720 721 722 724 724 725 726 727 727 728 728 729 733 735 735 736 737 738 738 739 740 742 742 743 743 745 745 749 750 751 753 754 754 755 757 758 759 76 760 761 763 764 765 768 769 769 770 771 771 772 773 776 778 780 780 783 784 785 786 787 788 789 79 790 790 791 793 794 794 795 796 797 799 7th 8 8 8 800 800 802 803 804 805 807 809 809 81 810 8117 812 821 822 823 824 825 826 827 827 83 830 832 833 834 834 835 835 836 837 839 839 84 840 841 846 846 85 850 851 851 853 854 855 856 857 8578 859 86 860 863 864 866 867 867 868 869 87 870 872 874 875 876 878 879 88 880 881 882 883 885 886 889 89 891 893 895 897 899 8th 8th 9 9 9 90 90 903 905 91 910 914 915 917 919 92 921 929 93 937 938 939 94 943 945 95 950 958 964 97 970 970 971 972 973 974 975 977 979 981 982 983 985 987 989 99 992 994 995 996 997 999 9th