P-10-25-20-5

Arnold Bennett
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, general-purpose terms that are not unique to the content.  They are removed during the analytic stage of modeling so that the more distinctive terminology stands out.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model for developing the networks, layering, directionality, and detailing of the model.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may include some unique terms, or may fall short of a particular standard asserted as ideal, but the methodology returns these words to the corpus for the informatic modeling regardless.  The generalized form of significant associations is therefore consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
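
The two general edits above (stemming and stop word removal) can be illustrated with a minimal Python sketch.  It is not the procedure actually used for this collection: the source does not name a stemmer, so naive_stem is a hypothetical stand-in, STOP_WORDS is an abbreviated subset of the list above, and the function names are assumptions made for illustration.  The point it shows is that the full token sequence is kept alongside the filtered one, so stop words can be returned in later stages.

```python
import re

# Abbreviated, illustrative subset of the stop word list above (assumption:
# in practice the full disclosed list would be loaded instead).
STOP_WORDS = {"a", "about", "and", "of", "the", "to", "was", "were", "with"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def naive_stem(word):
    """Hypothetical stand-in stemmer: strips a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def analytic_view(text, stop_words=STOP_WORDS):
    """Return (full_tokens, analytic_tokens).

    full_tokens keeps every word, so stop words remain available to later
    stages; analytic_tokens drops stop words and stems what remains.
    """
    full_tokens = tokenize(text)
    analytic_tokens = [naive_stem(w) for w in full_tokens if w not in stop_words]
    return full_tokens, analytic_tokens

full, analytic = analytic_view("The potters were working with clay and the kilns")
print(analytic)  # ['potter', 'work', 'clay', 'kiln']
print(len(full))  # 9, the stop words are still present in the full view
```

Swapping in a different stop word set changes only analytic_tokens; full_tokens is unaffected, which is the property the bullets above rely on.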


Specific Cautionaries

The following cautionaries are more specific to the Bennett Collection:
  • A large variety of numbers and number-letter combinations marked news sections.  All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  A small number of low-frequency tokens in which numbers were meshed with words were removed as well.  All of these combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed other than those on the Stop Word List.  These words were removed only for the framing and analytic stages, and they are returned during the network, layering, and detailing stages of modeling.
  • Errors in the content, such as word conversion errors, are not edited and will remain visible to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "Specific Edits" section.
  • Split words that are merged back together, if any, will be listed in specific edits.
  • The usability standard is applied moderately.  That is, terms like "ebook", proper nouns such as publisher names, or any other term reflective of the overall publication will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design emerges from categorical arrangements that are themselves inconsistent and imperfect, much like a hairstyle.  Even though the terms (the individual hairs) will change, the overall styling (the informatic model) will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
  • To improve the readability of models, non-alphanumeric symbols are likely to be removed.
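
As a rough illustration of the number and symbol cautionaries above, the following Python sketch separates tokens that contain digits from the rest and strips non-alphanumeric symbols from what remains.  It rests on a simplifying assumption: any token containing a digit is treated as a number or number-letter combination.  The exception for combinations that constitute words or abbreviations would need a manually reviewed list, which is not modeled here, and the function names are hypothetical.

```python
import re

def split_numeric_tokens(tokens):
    """Separate tokens containing any digit from purely non-numeric tokens.

    Assumption: a token with at least one digit counts as a number or a
    number-letter combination. Removed tokens are returned as well, so
    they can be disclosed rather than silently dropped.
    """
    kept, removed = [], []
    for token in tokens:
        (removed if re.search(r"\d", token) else kept).append(token)
    return kept, removed

def strip_symbols(token):
    """Drop characters that are neither letters nor digits (Unicode-aware)."""
    return "".join(ch for ch in token if ch.isalnum())

raw = ["8vo", "edition", "1910", "6d", "price", "10,000", "copies", "(london)"]
kept, removed = split_numeric_tokens(raw)
cleaned = [strip_symbols(t) for t in kept]
print(removed)  # ['8vo', '1910', '6d', '10,000']
print(cleaned)  # ['edition', 'price', 'copies', 'london']
```

Returning the removed tokens alongside the kept ones mirrors the disclosure approach of this document, where removed items are listed under Specific Edits.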

Specific Edits

0 0 0.50 00 0001 0007 0008 0009 0010 0011 0029 0035 0040 0045 0052 0053 0060 0063 0073 0076 0091 0094 0099 01 0106 0117 0123 0126 0136 0157 0161 0170 0171 0175 0180 0185 0186 0189 02 0206 0216 0221 0227 0232 0245 0246 0251 0267 0288 0297 03 0301 0308 0313 04 0469 0475 05 08 0d 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.20 1.50 10 10 10 10,000 100 1000 1054 1067 10s 10s 10s 11 112 11360 114 116 11641 118 12 126 12687 12773 12779 12794 12912 12995 13 13723 138 13894 13s 14 144 15 15 15 15 15 150 156 16 16 16 160 164 16th 17 171 172 175 18 18 1867 1875 1888 1889 1897 1898 18s 19 19 1904 1904 1905 1907 1908 1908 1909 191 1910 1910 1910 1911 1911 1911 1911 1912 1912 1912 1912_ 1913 1913 1913 1914 1914 1915 1915 1916 1919 1919 1919_ 1922_ 193 195 1s 1s 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2,250 20 20 20 20 20 20 200 2000 2002 2003 2004 2009 201 2011 2011 2012 2016 21 21 215 22 22 22 22 22 225,000 2250 226 23 23 234 239 24 249 24th 25 25 250 250 256 25th 26 262 262 27 27 273 28 28 28 289 29 29860 29860 2d 2nd 2s 2s 2s 2s 3 3 3 3 3 3 3 3 30 30 30 30 30 30 30 30 308 31 31 32 324 33 337 34 34 34 35 36 36 37 38 38 39 3s 4 4 4 4 4 4 4 4 4 4 4,500 40 40 40,000 400 40168 41 42 42 43 44 442 45 45 4500 46 48 49 49 4d 4in 4s 5 5 50 500 50c 52247 52247 5247 5247 5247 55 55â 56 562 58 5th 6 6 6 6 6 6 6 6 6 6 6d 6d 6d 6s 6s 6s 6s 6th 7 7 7 7 7 7 7 7 7,000 7.8 70â 73,065 74 79 7th 2 262 3 6 90 8 8 8 8 8 8 8 8 8 8 8 8,000 8405 86 8d 8s 8th 8vo 8vo 9 9 9 9 90 92 93 94 94 95 96 96 97 98 99 99 9th