P-8-7-20-2

Louisa May Alcott
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop Words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, non-unique, general terms.  They are removed during the analytic stage of modeling so that the more unique terminology of the content stands out.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model for developing its networks, layering, directionality, and detailing.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal.  Regardless, the methodology returns these words to the corpus for the informatic modeling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
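
The point about varying stop word lists can be illustrated with a minimal sketch.  It is not the procedure used for this collection: the two abbreviated stop word sets, the sample sentence, and the function names are hypothetical, stemming is left out, and a simple frequency count stands in for the analytic stage.  It only shows that a list which also drops one content word can still yield the same leading terms, and that the original tokens stay available for the later stages.

```python
from collections import Counter

# Hypothetical, abbreviated stop word sets (the full disclosed list is longer).
STOP_WORDS_A = {"and", "in", "of", "the", "to", "with"}
# Variant list that also treats one content word as a stop word.
STOP_WORDS_B = STOP_WORDS_A | {"walked"}


def analytic_tokens(tokens, stop_words):
    """Return a filtered copy of the tokens; the input list is not modified."""
    return [t for t in tokens if t.lower() not in stop_words]


def top_terms(tokens, stop_words, n=3):
    """Most frequent terms after stop word removal (stand-in for the analytic stage)."""
    counts = Counter(analytic_tokens(tokens, stop_words))
    return [term for term, _ in counts.most_common(n)]


# Hypothetical sample text, not taken from the collection.
tokens = (
    "jo and meg walked to the house and jo wrote in the garret "
    "of the house with meg and jo"
).split()

print(top_terms(tokens, STOP_WORDS_A))
print(top_terms(tokens, STOP_WORDS_B))

# The stop words are still present in the original token list,
# so later stages can work with the full text.
print("the" in tokens)
```

Both calls rank the same three terms first, because the word that differs between the two lists is of low frequency in the sample.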


Specific Cautionaries

The following cautionaries are more specific to the Alcott Collection:
  • There was a large variety of numbers and number-letter combinations that marked news sections. All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  A small number of low-frequency tokens in which numbers were meshed with words were removed as well.  All of these combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed other than those on the Stop Word List.  These words were removed only for the framing and analytic stages and are returned during the network, layering, and detailing stages of modeling.
  • Errors in the content, such as word conversion errors, are not edited and will remain transparent to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "Specific Edits" section.
  • Split words that are merged back together, if any, will be listed in the "Specific Edits" section.
  • The usability standard is applied moderately.  That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design emerges from categorical arrangements that are themselves inconsistent and imperfect, much like a hairstyle.  Even though the terms (the individual hairs) will change, the overall styling (the informatic model) will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
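
The removal of numbers and number-letter combinations described in the first cautionary above could be sketched roughly as follows.  This is an illustration under stated assumptions, not the exact rule applied to the collection: the function name and sample tokens are hypothetical, and a token is treated as removable whenever it contains at least one digit.

```python
import re

# Assumption: any token containing a digit counts as a number or
# number-letter combination; the rule actually applied may be narrower.
HAS_DIGIT = re.compile(r"\d")


def split_numeric(tokens):
    """Separate tokens into (kept, removed) without discarding either group."""
    kept, removed = [], []
    for tok in tokens:
        if HAS_DIGIT.search(tok):
            removed.append(tok)
        else:
            kept.append(tok)
    return kept, removed


# Hypothetical sample tokens in the style of the removed items.
sample = ["little", "women", "1868", "16mo", "8vo", "1,000", "6tening", "boston", "10th"]
kept, removed = split_numeric(sample)

print(kept)             # tokens passed on to modeling
print(sorted(removed))  # tokens to disclose as removed
```

Keeping the removed tokens in their own list, rather than dropping them silently, makes it possible to disclose them in full.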

Specific Edits

0 0 0 0 0 0 00 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1,000 1,000 1,000 1.00 1.00 1.00 1.25 1.25 1.25 1.50 1.50 1.50 1.50 1.75 10 10 10 10 10 10 10 10 10 10 10 10 10.00 100 100 100 100 100 100 100,000 101 102 10360 10360 104 105 106 108 109 10s 10s 10s 10s 10s 10s 10s 10s 10th 10th 10th 10th.â 11 11 11 11 11 11 11 11 112 113 114 116 116 117 117 118 118 118 11s 12 12 12 12 12 12 12.00 120 120 124 1240 12mo 12mo 12s 12s 12s 12s 12s 12s 12s 12th 13 13 13 13 13 130 131 133 135 13a 14 14 14 14 14 14 14 14 141 146 146 146 147 148 1493 14ll 14th 14th 14th 14th 15 15 15 15 15 15 15 15 150 152 156 158 1588 159 15th 16 16 16 16 16 16 16 160 160 160 161 161 162 163 164 1640 168 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16mo 16s 16s 16s 16s 16th 16th 17 17 17 17 17 17 17 17 17 17 171 172 1739 175 175 1775 178 178 179 1793 1799 17th 18 18 18 18 18 18 18 18 18 18 1808 181 1812 1815 182 183 1832 1832 1839 184 184 1840 1840 1840 1840 1843 1843 1843 1845 1846 1847 1847 185 1850 1850 1850 1851 1853 1854 1854 1855 1855 1855 1855 1855_ 1856 1856 1856 1856_ 1857 1857 1858 1859 1859 186 1860 1860 1860 1861 1861 1862 1863 1863 1863 1863 1863 1864 1865 1866 1867 1867 1867 1868 1868 1869 1869 1869 187 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1870 1871 1872 1872 1872 1872 1873 1873 1874 1874 1875 1875 1875 1875 1875 1875 1876 1877 1877 1877 1877 1878 1878 1878 1879 1879 1879 188 1880 1880 1880 1880 1880 1880 1880 1881 1881 1881 1881 1881 1882 1882 1882 1883 1883 1883 1884 1884 1885 1885 1885 1885 1885 1885 1885 1886 1886 1886 1886 1886 1886 1886 1886 1886 1887 1887 1887 1887 1887 1887 1887 1887 1887 1887 1887 1888 1888 1888 1889 1889 189 1892 18mo 18s 18s 18th 19 19 19 190 1901 191 199 19ij 1â 1s 1s 1s 1s 1s 1s 1st 1st 1st 1st 1st 1st_ 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2,000 2,50 20 20 20 20 20 
20 20 20 20 20 20 20 20 20 200 2003 2003 2005 201 202 205 207 20f 20th 20th 21 21 21 21 21 21 21 215 219 21s 21s 22 22 22 22 220 221 22d 23 23 23 23 23 238 24 24 24 24 24 242 244 24s 24th 25 25 25 25 25 257 25s 25th 25th 25th 26 26 265 26th 26th 27 2786 2787 2788 27th 27th 27th 28 2804 28th 28th 28th 29 29 29 29 29 29 29 29th 29th 2d 2d 2l 2nd 2nd 2nd 2nd 2s 2s 2s 2s 2s 2s 2s 2s 2s 2s 2s 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3,000 3,000 3,000 3,250 3.00 3.50 30 30,000 30s 30th 31 31 31s 31s 31s 31st 31st 32 32mo 32s 33 34 3499 35 35 36 366 36s 36s 37 3795 38 380 3806 3837 38567 38655 3d 3rd 3rd 3rd 3rd 3s 3s 3s 3s 3s 3s 3s 4 4 4 4 4 4 4 4 4 4 4 4 4,500 40 40 40682 40683 41 42 42 42 43 43 44 44 44 45 45 46 47 49 4j 4s 4th 4th 4th 4th 4th 4th 4to 4to 4to 4to 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 50 50 50 50 50 50 514 52 5352 54 55 570 57309 57310 58 5830 5s 5s 5s 5s 5s 5s 5th 6 6 6 6 6 6 6 6 6 6 6 6 6 6 60 60 600 6004 61 61 61 64 64 64 64d 67 69 69 6â 6b 6d 6d 6d 6d 6d 6d 6d 6d 6d 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6tening 6th 6th 6th 6us 6us 6wen 6wen 7 7 7 7 7 7 7 7 7 7 7 7 70 70 700 700 72 723 74 75 75 75 76 76 77 78 79 79 7s 7s 7s 7s 7s 7s 7s 7s 7th 00 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1,000 1.00 1.00 1.00 1.25 1.50 1.50 1.50 1.50 1.50 1.50 1.50 1.50 10 10 10 10 10 10 10.00 100 100 100 100 100,000 102 10360 104 106 108 10th 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 110 111 112 114 116 117 118 118 119 12 12 12 12 12 12 120 120 124 126 127 12â 12mo 12th 13 13 13 130 131 134 1396 13a 13d 14 14 14 14 142 144 146 146 146 148 1493 15 15 15 15 15 15 15 15 152 154 156 158 159 159 15th 16 16 160 160 161 161 161 162 163 165 168 16mo 16mo 16mo 16mo 16s 16s 16th 17 17 17 17 170 171 172 1739 174 176 178 178 17th 17th 18 18 18 18 180 182 182 1820 183 1832 184 1840 1840 1840 1840 1841 1843 1845 185 1850 1850 1851 1855 1855 1855 1855_ 1856 1856 1859 186 186 1860 1860 1861 1861 1861 1861 1861 1863 1863 1863 1863 1863 1866 1867 1867 1867 1868 1869 1870 1870 
1870 1872 1872 1872 1872 1874 1875 1875 1876 1876 1876 1877 1877 1877 188 1880 1880 1881 1881 1881 1882 1882 1882 1883 1885 1885 1885 1886 1886 1887 1887 1887 1887 1888 1889 1889 189 18s 18th 19 19 19 190 191 192 198 19ij 1â 1s 1s 1st 1st 1st 2 2 2 2 2 2 2 2 2 2 2 2 2 2,000 2.50 20 20 20 20 20 200 200 200 200 200 2005 201 202 204 205 207 20f 20th 20th 21 21 21 21 21s 21st 22 22 22 22 22 22 220 221 23 235 238 24 24 24 242 24th 24th 25 25 255 257 25th 25th 26 265 268 26th 27 2786 2787 2788 27th 27th 2804 28th 28th 28th 28th 28th 29 29 29 29th 29th 2d 2d 2d 2nd 2nd 2s 2s 3 3 3 3 3 3 3 3 3 3 3,000 3.30 30 30 30,000 300,000 30s 30th 31 31 31 32 33 34 3499 35 36 36 36s 36s 3795 38 3806 3837 38567 3d 3rd 3s 4 4 4 4 4 4 4 4 4 4 4 4,000 40 400 402 40682 40683 41 42 42 43 44 45 45 46 47 49 4j 4th 4th 4th 4to 4to 5 5 5 5 5 5 5 5 50 50 50 50 50 50 50 500 514 52 5352 54 54th 55 570 57309 57310 58 5830 5s 5s 5s 5s 5s 5s 5s 5s 5s 6 6 6 6 6 6 60 60 600 600,000,000 61 64 677 68 69 6b 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6d 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6s 6tening 6th 6th 6us 6wen 6wen 7 7 7 7 7 7 7 70 72 77 79 79 7th 7th 8 8 8 8 8 8 8 8 80 81 82 82 83 84 85 85 87 88 8kvkn 8th 8th 8th 8th 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 9 90 90 90 90 90 91 91 92 92 94 95 96 96 98 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 80 80 81 81 81 82 82 82 83 83 84 84 85 85 85 86 8677 87 87 88 8kvkn 8s 8s 8th 8th 8th 8th 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 8vo 9 9 9 9 9 9 9 9 90 90 91 91 92 92 92 93 94 94 94305 95 95 96 96 96 97 98 98 99 99