P-12-16-19-2

Gustave Flaubert
Collection

Cautionaries are edits made to the original content to improve the usability and clarity of the informatic design.  Edits should focus on identifying the framework of the original content in its entirety, including redundant messages of cultural or legal significance.  The following edits were made to the content to improve the framework:
  1. Words were stemmed.
  2. Stop words were removed for the framing and analytic stages.
  • The Stop Word List: 'a', 'about', 'above', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all', 'almost', 'alone', 'along', 'already', 'also','although','always','am','among', 'amongst', 'amoungst', 'amount',  'an', 'and', 'another', 'any','anyhow','anyone','anything','anyway', 'anywhere', 'are', 'around', 'as',  'at', 'back','be','became', 'because','become','becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being', 'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom','but', 'by', 'call', 'can', 'cannot', 'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during', 'each', 'eg', 'eight', 'either', 'eleven','else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every', 'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fify', 'fill', 'find', 'fire', 'first', 'five', 'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had', 'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself', 'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it', 'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile', 'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely', 'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now', 'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours', 'ourselves', 'out', 'over', 'own','part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed', 'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 
'some', 'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than', 'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein', 'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon', 'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter', 'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole', 'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves', 'the'.

  • The Reasoning Behind the Selection - These words are high-frequency, general terms that are not unique to the content.  They are removed during the analytic stage of modeling so that the more distinctive terminology stands out.  Other words could be included or excluded, as the method of removal isn't intended to be exact.  However, the terms removed should be non-unique, of high frequency, and fully disclosed to users of the informatic model.  After the analytic stage, these terms are returned to the informatic model when developing its networks, layering, directionality, and detailing.
  • Implications of Selection - The methodology generalizes the unstructured information.  A stop word list may vary in nuanced ways: it may or may not include some unique terms, and it may or may not meet a particular standard asserted as ideal.  In either case, the methodology returns these words to the corpus for the informatic modeling, and the generalized form of significant associations is consistently accounted for, even if some words of significant association were initially treated as stop words.  That is, there isn't a perfect stop word list, and lists will vary, but the informatic methodology manages these variations for a consistent outcome, so long as most non-unique terminology is removed.
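
The stop word handling described above can be sketched roughly as follows.  This is a minimal illustration, not the procedure actually used: the function names tokenize and analytic_view are hypothetical, STOP_WORDS is only a short excerpt of the list above, and stemming is left out because the stemmer is not named here.  The point it shows is that the filtered view is built separately, so the original tokens remain available to be returned at the later modeling stages.

```python
import re

# Short excerpt of the stop word list above (assumption: the full list
# would be substituted here in practice).
STOP_WORDS = {"a", "about", "and", "in", "of", "the", "to", "was"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def analytic_view(tokens, stop_words=STOP_WORDS):
    """Return a new list without stop words; the input list is not modified."""
    return [t for t in tokens if t not in stop_words]


# The full token list is kept so stop words can be returned to the model
# after the analytic stage.
tokens = tokenize("The heart of Emma was in the clouds")
filtered = analytic_view(tokens)
```

Here filtered holds only the distinctive terms for the analytic stage, while tokens still carries every word, stop words included, for the network, layering, and detailing stages.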


Specific Cautionaries

The following cautionaries are specific to the Flaubert - Collection:
  • A large variety of numbers and number-letter combinations marked news sections.  All numbers and letter-number combinations not constituting words or abbreviations were removed after the analytic modeling stage.  A small number of low-frequency tokens in which numbers were meshed with words were removed as well.  All such combinations were removed to improve the usability and clarity of the content being modeled informatically.
  • No words were removed other than those on the Stop Word List.  These words were removed only for the framing and analytic stages, and are returned during the network, layering, and detailing stages of modeling.
  • Errors in the content, such as word conversion errors, are not edited and will remain visible to viewers of the model.  The focus is on developing trust through process and procedure, not through avenues easily manipulated, such as finely-threaded performances of perfection and cosmetic appeal.  Exceptions will be listed in the "Specific Edits" section.
  • Split words that are merged back together, if any, will be listed in the "Specific Edits" section.
  • The usability standard is applied moderately.  That is, terms like "ebook", proper nouns such as publisher names, and any other terms reflective of the overall publication will likely be included in the modeling process.  The models are designed to account for terms that work in different contexts, such as publication terms, which will be presented alongside the design of the actual written work, with the ideas of the given author intact.
  • This methodology is designed to manage the unstructured informational environment, in which a sound and consistent overall design emerges from categorical arrangements that are inconsistent and imperfect, much like a hairstyle.  Even though the terms (the individual hairs) will change, the overall styling (the informatic model) will remain largely the same, with a consistent arrangement of major nodes.  In this way, the unstructured informational environment differs from the structured informational environment.
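
The removal of numbers and letter-number combinations described in the first cautionary of this list might look something like the sketch below.  It rests on a simplifying assumption: any token containing a digit is treated as removable unless it appears in a keep set of recognized words or abbreviations.  The function split_numeric and the keep parameter are hypothetical names introduced for illustration only.

```python
import re

# Matches any token that contains at least one digit.
HAS_DIGIT = re.compile(r"\d")


def split_numeric(tokens, keep=frozenset()):
    """Separate tokens into (kept, removed).

    A token is removed when it contains a digit and is not in `keep`
    (a hypothetical allow-list of words or abbreviations to retain).
    """
    kept, removed = [], []
    for token in tokens:
        if HAS_DIGIT.search(token) and token not in keep:
            removed.append(token)
        else:
            kept.append(token)
    return kept, removed


sample = ["madame", "1857", "24th", "bovary", "2the", "5,000"]
kept, removed = split_numeric(sample)
```

Collecting the removed tokens in their own list, rather than discarding them, is what allows them to be disclosed in full, as is done under Specific Edits.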

Specific Edits
0 0 0 0 0 0 01 0113112157 04h 0f 0f 0f 0f 0n 0n 0n 0r 0 0 0 01 0f 0n 0nâ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.08 100 101 103 104 105 107 109 10como 10ved 10ver 11 11 110 111th 112 1120001215 114 116 1161 117 119 1'1stiã 120 124 125 1253 126 128 13 13 132 134 135 136 137 139 14 140 141 143 144 145 146 147 148 149 15 15 150 154 156 157 160 162 165 167 168 1692 17 170 171 172 173 175 176 177 179 18 18 18 18 18 18 1809 181 182 1825 1825 1830 185 186 19 190 190 193 19333111ch 194 198 1a 1â 1â 1â 1e 1e 1e 1fâ 1her 1making 1n 1n 1n 1n 1n 1n 1n 1n 1s 2 2 2 2 2 20 201 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 204 205 209 21 210 2'11 214 215 217 219 220 222 223 228 229 232 233 234 235 237 238 24 24 242 243 246 248 24th 24th 250 252 253 255 256 258 260 263 264 267â 269 27 270 272 276 27s 28 28 28 280 281 289 291 296 297 29s 2a 2the 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 30 30 30 300 302 306 310 3'12 314 315 316 318 320 3200 322 323 324 326 327 3273 328 329 33 330 331 332 333 336 337 338 339 34 34'0â 342 344 346 34828 34828 34828 34828 35 351 352 354 356 358 358 36 360 362 363â 364 365 366 367 37 38 39 3rd 3they 4 4 4 4 4 4 40 41 42 48 4828 4828 4â 5 5 5 5 5 5 5 5 5 5 5 5 5 5,000 52 53 54 55 55 58 59 5a 5â 5â 5o 60 62 64 65 66 6sthetic 7 7 70 71 72 72 73 74 77 78 7â 7â 8 8 8 8 8 8 80 828 828 83 84 84 85 86 89 8re 9 9 9 9 90 90 97 97â 98 99 9â 9â 9ans 9chale 9e 9n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.08 10 10 10 10 100 101 102 103 104 105 107 109 10como 10ver 10ver 11 11 11 11 11 11 11 11 110 111 111 111 111 1115 111ise 111th 112 114 115 116 1161 117 118 119 12 12 12 12 12 12 120 120 121 122 124 125 1253 1253 1253 126 127 128 129 129 1290 1290 13 13 13 13 131 13111111156 132 133 134 135 1351 136 137 139 14 14 14 140 141 142 143 144 145 146 147 148 149 15 15 15 15 151 153 
154 155 156 157 1587 159 16 16 160 161 162 163 165 167 168 169 169 1693 17 17 17 170 171 172 173 175 176 177 179 18 18 18 18 18 181 1819 182 183 1837 1840 1845 185 1857 186 187 19 19 19 190 190 1904 191 193 194 195 196 198 199 1a 1â 1â 1â 1â 1â 1â 1â 1deasm 1e 1e 1fâ 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1n 1t 1t 2 2 2 2 2 2 2 2 2 20 20 20 201 201 2019 203 204 205 207 209 21 210 210 2'11 212 213 214 215 217 219 22 220 221 222 223 225 227 228 23 23 23 231 232 233 234 235 237 238 24 24 24 24 241 242 243 244 245 246 247 248 249 25 25 250 2511 252 253 254 255 256 257 258 259 26 260 261 263 264 265 267â 269 27 27 270 271 272 274 275 276 277 279 27s 28 28 280 281 283 285 289 29 290 291 293 294 295 296 297 299 29s 2â 2â 2â 2at 2fru1ts 2the 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 30 30 30 300 301 302 303 305 306 307 3'09 31 31 31 310 3'12 313 314 315 316 317 318 319 32 320 321 322 323 324 325 326 327 3273 328 329 33 33 330 3308 331 332 333 335 336 337 338 339 34 34 34'0â 341 342 343 343 344 344 345 346 347 348 34828 34828 349 35 350 351 352 353 354 355 356 357 358 358 36 36 360 361 362 363â 364 365 366 367 37 37 38 38 39 39 39 3â 3â 3rd 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 40 40 41 41 42 42 43 43 43359 44 44 45 45 46 47 47 48 48 49 4â 4â 4â 4i 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 50 51 51 510ng'er 52 52 53 53 54 54 55 55 55 56 57 57 58 58 59 59 5â 5and 5furniture 5movement 5o 6 6 6 60 61 61 62 62 63 63 64 64 65 65 66 66 67 67 68 69 69 6or 7 7 7 7 7 7 70 7011 71 71 72 72 73 73 74 74 75 75 76 77 77 78 78 79 7â 7â 8 8 8 8 8 8 8 8 8 8 80 80 80 80 80 80 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 81 82 83 83 84 85 85 86 871â 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 
88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 89 9 9 9 9 9 9 9 9 90 91 93 95 97 97â 98 99 9â â