P-10-25-20-6

John Galsworthy
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed, using the list below.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, non-unique, general terms.  They are removed so that the more distinctive terminology stands out during the analytic stage of modeling.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model for developing its networks, layering, directionality, and detailing.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal.  Regardless, the methodology returns these words to the corpus for the informatic modeling, so the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
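
The removal-and-return behaviour described above can be sketched as follows.  This is a minimal illustration, not the procedure actually used for the collection: the function names, the abbreviated stop word set, and the crude suffix-stripping stemmer are all assumptions made for the example (the source does not say which stemmer was applied).

```python
import re

# Abbreviated stand-in for the full Stop Word List given above.
STOP_WORDS = {"a", "the", "of", "and", "to", "in", "was", "he", "his"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def crude_stem(word):
    """Toy suffix stripper (assumption: stands in for an unspecified stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def analytic_view(tokens):
    """Framing and analytic stages: words are stemmed, then stop words are dropped."""
    stemmed = [crude_stem(t) for t in tokens]
    return [t for t in stemmed if t not in STOP_WORDS]

def modeling_view(tokens):
    """Network, layering, and detailing stages: every token is kept."""
    return list(tokens)

tokens = tokenize("The man walked to the house and opened his letters")
print(analytic_view(tokens))   # stop words absent
print(modeling_view(tokens))   # stop words present again
```

Because the stop words are only filtered out of the analytic view and never deleted from the underlying token list, swapping in a somewhat different stop word list changes what the analytic stage sees but not what the later modeling stages receive.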


Specific Cautionaries

The following cautionaries are specific to the Galsworthy Collection:
  • There was a large variety of numbers and number-letter combinations that marked news sections.  All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  A small number of low-frequency tokens in which numbers are fused with words were removed as well.  All such combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed other than those on the Stop Word List.  These words were removed only for the framing and analytic stages, and they are returned during the network, layering, and detailing stages of modeling.
  • Errors in the content, such as conversion errors in words, are not edited and will remain transparent to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "Specific Edits" section.
  • Split words that are merged back together, if any, will be listed in the "Specific Edits" section.
  • The usability standard is applied moderately.  That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design emerges from categorical arrangements that are themselves inconsistent and imperfect, much like a hairstyle.  Even though the terms (the individual hairs) will change, the overall styling (the informatic model) will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
  • To improve the readability of the models, non-alphanumeric symbols are likely to be removed.
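
The number and symbol handling described in this list can be illustrated with a short sketch.  It is only an approximation under stated assumptions: the rule "any token containing a digit is removed" is inferred from the kinds of tokens listed under Specific Edits, and the function names are hypothetical.  Removed tokens are collected rather than discarded so that they can be disclosed.

```python
import re

def contains_digit(token):
    """True for numbers and for number-letter combinations such as '13th'."""
    return re.search(r"\d", token) is not None

def strip_symbols(token):
    """Drop non-alphanumeric characters, keeping apostrophes inside words."""
    return re.sub(r"[^A-Za-z0-9']", "", token)

def split_tokens(tokens):
    """Return (kept, removed): cleaned word tokens and the digit-bearing tokens set aside."""
    kept, removed = [], []
    for raw in tokens:
        if contains_digit(raw):
            removed.append(raw)      # recorded so the removal can be listed
            continue
        cleaned = strip_symbols(raw)
        if cleaned:                  # skip tokens that were symbols only
            kept.append(cleaned)
    return kept, removed

kept, removed = split_tokens(["house", "1,850,000", "13th", "93by", "ebook", "--", "man_"])
print(kept)
print(removed)
```

Conversion errors inside ordinary words are left alone here, in keeping with the cautionary above that such errors are not edited.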

Specific Edits

0 0 0 0 00 01 04 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1,035 1,850,000 10 10 10 10 10 10 10,000 100 109k 11.45 12 12 12,000 13 13 13th 14 14 145,304 15 15 15,000 15th 16,000,000,000 1620 16ths 17 17 17 17 17 18 18 1816 1830 1850 187 1876 1880 1886 1887 1887 1890 1890 1891 1892 1895 1899 19 19 190 1900 1900 1900 1901 1901 1902 1904 1906 1906 1907 1908 1908 1909 1909 1909 191 1910 1910 1910 1910 1910 1910 1911 1911 1911 1911 1911 1911 1911 1912 1912 1912 1912 1912 1912 1912 1913 1913 1913 1913 1913 1913 1914 1914 1914 1914 1914 1915 1915 1915 1915 1915 1916 1916 1916 1916 1916 1916 1916 1917 1917 1917 1917 1917 1917 1917 1917 1917 1917 1917 1918 1918 1918 1919 1919 1919 1922 1947 19th 1st 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2,000 2,100,000,000 20 20 20 20 2001 2002 2004 2009 209 20th 20th 21 2192 21st 227 2309 245 2453 25 26 2639 2683 2771 2772 2773 2774 28089 283 283 28760 289 28th 29 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2910 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2919 2919 2920 295 29th 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 30 30 305a 30th 31 31 31 32 32 32 3459 348 35 35s 37 37 38655 38k 3d 3rd.â 4 4 4 4 4 4 4 4 4.30 400 4109 42 4261 4269 43 430 4397 44 45 47 47 47 47 47 47 4764 4764 4764 4765 4d 4s 5 5 5 5 5 5 5 5 5 500 5000 5055 5056 5058 5059 5060 51 51k 53 55 598 6 6 6 6 6 6 6 6 6 6_ 60 60k 62 65 65 6th 01 01 01 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1,500,000 1,850,000 10 10 10 10,000 100,000 10th 11 11.30 120 13 13th 145,304 15 159 15th 16 1620 16ths 17 17 17 174 179 1820 1824 1832 1840 1867 1869 187 1880 1886 1890 19,000,000 1901 1904 1906 1906 1908 1909 191 1910 1910 1910 1911 1911 1911 1912 1912 1913 1913 1914 1914 1914 1914 1915 1915 1915 1916 1916 1916 1917 1917 1917 1917 1917 1917 1918 1919 1919 1947 1947 1984 1984 2 2 2 2 2 2 2 2 2 2 2 2 2 2,000 20 200 2002 2004 2009 205 207 209 20th 20th 2192 227 23,000,000 2309 245 2453 25 250,000 2639 2683 27,000,000 273 2771 2772 2773 2774 27th 
28089 283 28760 2901 2902 2903 2904 2905 2906 2907 2908 2909 2910 2910 2910 2910 2910 2911 2912 2913 2914 2915 2916 2917 2918 2919 2919 2919 2919 2919 2920 295 29th 3 3 3 3 3 3 3 3 3,200,000 3.10 30 30 300 300 3007 305a 30th 322 328 34 38k 4 4 4 4.30 40 40 4109 42 4261 4269 43 430 4397 44 47 4764 4764 4764 4764 4764 4765 5 5 500 5055 5056 5058 5059 5060 51k 598 5th 6 6 6.30 600 60k 62 63 6th 7 7 7 750,000 7544 7th 8 8 8 8,000,000 8,500,000 8.69 800,000 83 83 88 8â 8th 8th 8th 8th 9 9 90 90,000 93by 93don't 93for 93grandpapa 93i 93i 93if 93i've 93lady 93lord 93no 93no 93nonsense 93nothing 93oh 93oh 93the 93the 93what 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 94 97 97a 97a 97i'll 97it 97the 97very 97without 9th 9th 7 7 7 7 7 7 7 7 7 7 70 700,000 7544 76 77 79 7d.â 7s 7th 7th 7th 7th 7th 7x 8 8 8 8 8 8 8 8,000,000 8,000,000,000 8,500,000 8.69 80 85 86 87 88 8th 8th 8th 9 9 9 9 9 9 9 91 92 92 93 94 95 96 97 98 99 9th 9th