{"id":2738,"date":"2013-05-06T22:05:31","date_gmt":"2013-05-06T14:05:31","guid":{"rendered":"https:\/\/kyle.ai\/blog\/?p=2738"},"modified":"2013-05-06T22:08:38","modified_gmt":"2013-05-06T14:08:38","slug":"%e6%80%8e%e6%a0%b7%e5%86%99%e4%b8%80%e4%b8%aa%e6%8b%bc%e5%86%99%e6%a3%80%e6%9f%a5%e5%99%a8","status":"publish","type":"post","link":"https:\/\/kyle.ai\/blog\/2738.html","title":{"rendered":"How to Write a Spelling Corrector"},"content":{"rendered":"<p><span style=\"font-size: small;\">Last week, two friends of mine, Dean and Bill, independently told me they were amazed at how fast and how good Google's spelling correction is. Type in a search like [speling] and, in less than 0.1 seconds, Google comes back with: Did you mean [spelling]? (Yahoo! and Microsoft have similar features.) What surprised me a little was that I had assumed Dean and Bill, both highly accomplished engineers and mathematicians, would have a professional feel for how a spelling corrector is built on a statistical language model. They did not seem to. On reflection, they really have no particular reason to be familiar with statistical language models. The problem was not with their knowledge; my assumption was simply wrong.<\/span><\/p>\n<p><span style=\"font-size: small;\">I figured that they, and others, would benefit from an explanation of this kind of work. 
However, the full details of an industrial-strength spelling corrector like Google's would be more confusing than enlightening. So on the flight home a few days ago, I wrote a few dozen lines of code as a toy spelling corrector. The goal was a corrector that handles at least 10 words per second with 80% to 90% accuracy. Below is my code, written in Python 2.5: 21 lines in all, and a complete, working spelling corrector.<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nimport re, collections\r\n\r\ndef words(text): return re.findall('&#x5B;a-z]+', text.lower())\r\n\r\ndef train(features):\r\n    model = collections.defaultdict(lambda: 1)\r\n    for f in features:\r\n        model&#x5B;f] += 1\r\n    return model\r\n\r\nNWORDS = train(words(file('big.txt').read()))\r\n\r\nalphabet = 'abcdefghijklmnopqrstuvwxyz'\r\n\r\ndef edits1(word):\r\n    n = len(word)\r\n    return set(&#x5B;word&#x5B;0:i]+word&#x5B;i+1:] for i in range(n)] +                     # deletion\r\n               &#x5B;word&#x5B;0:i]+word&#x5B;i+1]+word&#x5B;i]+word&#x5B;i+2:] for i in range(n-1)] + # transposition\r\n               &#x5B;word&#x5B;0:i]+c+word&#x5B;i+1:] for i in range(n) for c in alphabet] + # alteration\r\n               &#x5B;word&#x5B;0:i]+c+word&#x5B;i:] for i in range(n+1) for c in alphabet])  # insertion\r\n\r\ndef known_edits2(word):\r\n    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)\r\n\r\ndef known(words): return set(w for w in words if w in NWORDS)\r\n\r\ndef correct(word):\r\n    candidates = known(&#x5B;word]) or known(edits1(word)) or 
known_edits2(word) or &#x5B;word]\r\n    return max(candidates, key=lambda w: NWORDS&#x5B;w])\r\n<\/pre>\n<p><span style=\"font-size: small;\">This code defines a function, correct, that takes a word as input and returns the most likely spelling correction. For example:<\/span><\/p>\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\r\n&gt;&gt;&gt; correct('speling')\r\n'spelling'\r\n&gt;&gt;&gt; correct('korrecter')\r\n'corrector'\r\n<\/pre>\n<p><span style=\"font-size: small; color: #ff0000;\">How It Works: Some Simple Probability Theory<\/span><\/p>\n<p><span style=\"font-size: small;\">Here is a brief account of how it works. Given a word, our task is to choose the correctly spelled word that is most similar to it. (If the word is already spelled correctly, the closest word is the word itself.) Of course, there is no way to know for certain which word is closest. Given the word lates, for example, should it be corrected to late or to latest? This difficulty tells us that we need probability theory rather than rule-based judgments. 
Given a word w, we want to find, among all correctly spelled words, the word c whose conditional probability given w is highest. That is:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nargmax_c P(c|w)\r\n<\/pre>\n<p><span style=\"font-size: small;\">By Bayes' theorem, this is equivalent to:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nargmax_c P(w|c) P(c) \/ P(w)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Because w is fixed (it is whatever the user typed), P(w) is the same for every candidate c, so we can drop it from the expression and write:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nargmax_c P(w|c) P(c)\r\n<\/pre>\n<p><span style=\"font-size: small;\">This expression has three parts. From right to left, they are:<\/span><\/p>\n<p><span style=\"font-size: small;\">1. P(c), the probability that a correctly spelled word c appears in text; that is, how likely is c to show up in English writing? Because this probability is determined entirely by the English language, we call it the language model. For instance, P(&#8216;the&#8217;) is relatively high, while P(&#8216;zxzxzxzyy&#8217;) is close to 0 (supposing the latter were a word at all).<\/span><\/p>\n<p><span style=\"font-size: small;\">2. 
P(w|c), the probability that the user types w when c was intended. Because it captures how likely the user is to mistype c as w, it is called the error model.<\/span><\/p>\n<p><span style=\"font-size: small;\">3. argmax_c, which enumerates all possible values of c and picks the one with the highest probability. The reasoning is that if a (correct) word is frequent, and users easily mistype it as some particular wrong word, then that wrong word should be corrected to this correct one.<\/span><br \/>\n<span style=\"font-size: small;\"> Someone is bound to ask: why take a simple expression like P(c|w) and replace it with a more complicated one involving two terms? The answer is that P(c|w) inherently depends on both factors at once, so separating them makes it easier to handle. Take the misspelled word thew. It looks as if thaw should be the correction, since that only requires having typed e for a. But the user may well have wanted the, because the is a very common English word and a finger could easily have slipped from e to w while typing. So to estimate P(c|w) in a case like this, we have to consider both the probability of c and the probability of going from c to w. 
Splitting one term into two makes the problem both easier and clearer.<\/span><\/p>\n<p><span style=\"font-size: small;\">Now let us see how the program actually works. First comes P(c). We read in a big text file, big.txt, which contains about a million words and serves as our corpus. The file is assembled from several books available from Project Gutenberg, together with material from Wiktionary and the British National Corpus. (On the plane all I had was the collected Sherlock Holmes stories; I added more sources later, until the results stopped improving noticeably.)<\/span><\/p>\n<p><span style=\"font-size: small;\">Next, a function called words extracts every word from the corpus, converts it to lowercase, and strips out any special characters inside words. A word is therefore just a sequence of letters, so don&#8217;t becomes don and t. We then train a probability model. Do not be put off by the term: it simply means counting how many times each word occurs. 
The train function does exactly that.<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef words(text): return re.findall('&#x5B;a-z]+', text.lower())\r\n\r\ndef train(features):\r\n    model = collections.defaultdict(lambda: 1)\r\n    for f in features:\r\n        model&#x5B;f] += 1\r\n    return model\r\n\r\nNWORDS = train(words(file('big.txt').read()))\r\n<\/pre>\n<p><span style=\"font-size: small;\">In effect, NWORDS[w] holds the number of times the word w appears in the corpus (plus one, because of the smoothing described next). One problem is what to do with words we have never seen. Suppose a word is spelled perfectly correctly but does not occur in the corpus, so it never appears in the training data. We would then report its probability as 0. That is not good, because a probability of 0 means the event can never happen, whereas in our model we want to represent this situation with a very small probability instead. There are several established, standard ways to deal with this; we take the simplest one and treat every never-seen word as if it had been seen once. This process is usually called &#8220;smoothing&#8221;, because we replace the zeros in the probability distribution with small probability values. 
On the implementation side, we can use the defaultdict class from Python's collections module. It behaves like the standard Python dict (called a hash table in some other languages), the only difference being that it supplies a default value for any missing key. In our case we pass the anonymous function lambda: 1, which sets the default value to 1.<\/span><\/p>\n<p><span style=\"font-size: small;\">The next question is: given a word w, how do we enumerate all of its possible correct spellings? This has been studied thoroughly, and the answer is the concept of edit distance. The edit distance between two words is the number of edits needed to turn one into the other, where an edit is an insertion (insert one letter into the word), a deletion (remove one letter), a transposition (swap two adjacent letters), or an alteration (change one letter to another).<\/span><br \/>\n<span style=\"font-size: small;\"> The following function returns the set of all strings at edit distance 1 from the word w.<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef edits1(word):\r\n    n = len(word)\r\n    return set(&#x5B;word&#x5B;0:i]+word&#x5B;i+1:] for i in range(n)] +                     # deletion\r\n               &#x5B;word&#x5B;0:i]+word&#x5B;i+1]+word&#x5B;i]+word&#x5B;i+2:] for i in range(n-1)] + # transposition\r\n               
&#x5B;word&#x5B;0:i]+c+word&#x5B;i+1:] for i in range(n) for c in alphabet] + # alteration\r\n               &#x5B;word&#x5B;0:i]+c+word&#x5B;i:] for i in range(n+1) for c in alphabet])  # insertion\r\n<\/pre>\n<p><span style=\"font-size: small;\">Clearly this set is large. For a word of length n there are n deletions, n-1 transpositions, 26n alterations (translator's note: actually 25n), and 26(n+1) insertions (translator's note: actually fewer, because inserting a letter immediately before or immediately after an identical letter produces the same string). That gives 54n + 25 cases in total, with a few duplicates among them. For the word something, for example, this formula gives 511 strings at edit distance 1, while the actual number is 494.<\/span><br \/>\n<span style=\"font-size: small;\"> The literature on spelling correction generally claims that about 80% to 95% of spelling errors are within edit distance 1 of the intended word. However, as we will see below, when I ran an experiment on a corpus of 270 spelling errors, I found that only 76% of them were at edit distance 1. Perhaps the examples I picked are harder than typical ones. Either way, I did not think that was good enough, so I started considering words at edit distance 2. 
This is easy: just apply edits1 again to every element of the set that edits1 returns. So we define the function edits2:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef edits2(word):\r\n    return set(e2 for e1 in edits1(word) for e2 in edits1(e1))\r\n<\/pre>\n<p><span style=\"font-size: small;\">The statement is easy to write, but it hides a great deal of computation: there are 114,324 strings within edit distance 2 of something. On the other hand, once we allow edit distance 2 we cover nearly every case; only 3 of the 270 test cases have an edit distance greater than 2. We can also make a small optimization: of all the strings within edit distance 2, keep only the ones that are actually known words as candidates. 
We still consider every possibility, but we no longer need to build a huge set. So we write a function called known_edits2 that returns only the known words within edit distance 2 of w:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef known_edits2(word):\r\n    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Now, for the something example, known_edits2(&#8216;something&#8217;) returns just 3 words: &#8216;smoothing&#8217;, &#8216;something&#8217; and &#8216;soothing&#8217;, rather than the full set of 114,324 strings at edit distance 1 or 2. This optimization speeds things up by about 10%.<\/span><br \/>\n<span style=\"font-size: small;\"> All that remains is the error model, P(w|c). This is the part that stumped me at the time. I was on a plane with no internet connection, and so had no data from which to build a model of spelling errors. I did have some common-sense intuitions: mistaking one vowel for another is more likely than mistaking one consonant for another (people often type hallo for hello, for instance); getting the first letter of a word wrong is relatively unlikely; and so on. But I had no actual numbers to back up these intuitions. 
So I took a simple approach: known words at edit distance 1 take priority over known words at edit distance 2, and a known word at edit distance 0 takes priority over those at edit distance 1. In code, that looks like this:<\/span><\/p>\n<p><span style=\"font-size: small;\">(Translator's note: the author uses a neat feature of Python here, short-circuit evaluation. In the code below, as soon as one of the known(...) sets is non-empty, candidates takes that set and the expressions after it are never evaluated. Short-circuiting with or is thus a very simple way to implement the priority order.)<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ndef known(words): return set(w for w in words if w in NWORDS)\r\n\r\ndef correct(word):\r\n    candidates = known(&#x5B;word]) or known(edits1(word)) or known_edits2(word) or &#x5B;word]\r\n    return max(candidates, key=lambda w: NWORDS&#x5B;w])\r\n<\/pre>\n<p><span style=\"font-size: small;\">The correct function picks the most probable element of the candidate set. In practice, that means the candidate with the largest P(c), as estimated by the counts stored in NWORDS.<\/span><\/p>\n<p><span style=\"font-size: small; color: #ff0000;\">Evaluation<\/span><\/p>\n<p><span style=\"font-size: small;\">Now let us see how well the algorithm works. On the plane I tried several examples, and the results seemed acceptable. 
After the plane landed, I downloaded Roger Mitton's Birkbeck spelling error corpus from the Oxford Text Archive. From it I extracted two sets of test cases for the corrector. The first serves as a development set, which I may look at while developing the program. The second is the final test set: I do not look at it until the program is finished, and the score on it counts as the final result. Keeping separate development and test sets is good practice; at the very least it keeps me from fooling myself by tuning the program to one particular data set. Here I show an excerpt of the test data and the function that runs the tests. 
The complete test sets and the full program can be found in spell.py.<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ntests1 = { 'access': 'acess', 'accessing': 'accesing', 'accommodation':\r\n    'accomodation acommodation acomodation', 'account': 'acount', ...}\r\n\r\ntests2 = {'forbidden': 'forbiden', 'decisions': 'deciscions descisions',\r\n    'supposedly': 'supposidly', 'embellishing': 'embelishing', ...}\r\n\r\ndef spelltest(tests, bias=None, verbose=False):\r\n    import time\r\n    n, bad, unknown, start = 0, 0, 0, time.clock()\r\n    if bias:\r\n        for target in tests: NWORDS&#x5B;target] += bias\r\n    for target,wrongs in tests.items():\r\n        for wrong in wrongs.split():\r\n            n += 1\r\n            w = correct(wrong)\r\n            if w!=target:\r\n                bad += 1\r\n                unknown += (target not in NWORDS)\r\n                if verbose:\r\n                    print '%r =&gt; %r (%d); expected %r (%d)' % (\r\n                        wrong, w, NWORDS&#x5B;w], target, NWORDS&#x5B;target])\r\n    return dict(bad=bad, n=n, bias=bias, pct=int(100. 
- 100.*bad\/n),\r\n                unknown=unknown, secs=int(time.clock()-start) )\r\n\r\nprint spelltest(tests1)\r\nprint spelltest(tests2) ## only do this after everything is debugged\r\n<\/pre>\n<p><span style=\"font-size: small;\">The program produces the following output:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n{'bad': 68, 'bias': None, 'unknown': 15, 'secs': 16, 'pct': 74, 'n': 270}\r\n{'bad': 130, 'bias': None, 'unknown': 43, 'secs': 26, 'pct': 67, 'n': 400}\r\n<\/pre>\n<p><span style=\"font-size: small;\">On the development set of 270 cases, I get 74% correct in about 16 seconds (roughly 17 words per second); on the final test set, I get 67% correct (about 15 words per second).<\/span><\/p>\n<p><span style=\"font-size: small;\">Update: in the original version of this article I reported results that were too high. The cause was a small bug in the program. Although the bug was subtle, I should have been able to avoid it, and I apologize to anyone who was misled by the earlier version. On the fourth line of spelltest I had left out the if bias: test and had given bias a default value of 0. My reasoning was that if bias is 0, the statement NWORDS[target] += bias has no effect. In fact, although the statement does not change the value of NWORDS[target], it does make (target in NWORDS) true. 
As a result, spelltest treated the correctly spelled words that were unknown to the model as if they were known, so the program was &#8220;cheating&#8221;. I like defaultdict for its brevity, which is why I used it in the program, but with a regular dict this problem would not have occurred.<\/span><br \/>\n<span style=\"font-size: small;\"> Conclusion: I met my goals for brevity, development time, and running speed, but the accuracy is not very good.<\/span><\/p>\n<p><span style=\"font-size: small; color: #ff0000;\">Future Work<\/span><\/p>\n<p><span style=\"font-size: small;\">How could we do better? Let us go back to the three factors of the probability model: (1) P(c); (2) P(w|c); and (3) argmax_c. We will start from the cases the program got wrong and see what we have overlooked, both within these three factors and beyond them.<\/span><\/p>\n<p><span style=\"font-size: small;\">P(c), the language model. Two kinds of problem in the language model lead to wrong corrections. The more serious one is unknown words. The development set contains 15 unknown words, about 5% of the cases; the final test set contains 43 unknown words, or 11%. 
When spelltest is called with verbose set to True, we see output like the following:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('economtric') =&gt; 'economic' (121); expected 'econometric' (1)\r\ncorrect('embaras') =&gt; 'embargo' (8); expected 'embarrass' (1)\r\ncorrect('colate') =&gt; 'coat' (173); expected 'collate' (1)\r\ncorrect('orentated') =&gt; 'orentated' (1); expected 'orientated' (1)\r\ncorrect('unequivocaly') =&gt; 'unequivocal' (2); expected 'unequivocally' (1)\r\ncorrect('generataed') =&gt; 'generate' (2); expected 'generated' (1)\r\ncorrect('guidlines') =&gt; 'guideline' (2); expected 'guidelines' (1)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Each line shows the result of calling correct on a misspelled word (with the NWORDS count of the result in parentheses), followed by the expected answer and its count. A count of 1 is the default value, meaning the word never occurred in the corpus. The output tells us that if the program does not know that &#8216;econometric&#8217; is a word at all, it cannot possibly correct &#8216;economtric&#8217; to &#8216;econometric&#8217;. We could address this by adding more text to the training corpus, although that might also introduce new errors. Notice also that in the last four lines the training corpus does contain the right word, just in a slightly different form. 
So we could improve the program by teaching it, for example, that adding &#8216;-ed&#8217; to a verb or &#8216;-s&#8217; to a noun yields a legitimate word.<\/span><\/p>\n<p><span style=\"font-size: small;\">The second possible source of error is the probabilities themselves: both words are in our dictionary, but the one we choose as more probable happens not to be the one the user wanted. I should say, though, that this is not the most serious problem, nor does it occur in isolation; other causes are likely to matter more.<\/span><\/p>\n<p><span style=\"font-size: small;\">We can run a simulation to see how much the final results would improve with a better language model. To do so, we &#8220;cheat&#8221; a little on the test data. The spelltest function has a parameter called bias, which in effect adds the correctly spelled word to the model several extra times, raising its probability in the language model. In other words, we pretend the correct word occurred in the corpus 1 more time, or 10 more times, or more. 
As we increase the value of bias, accuracy on both the development set and the final test set improves markedly.<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nBias\t Dev set\t Test set\r\n0\t 74%\t 67%\r\n1\t 74%\t 70%\r\n10\t 76%\t 73%\r\n100\t 82%\t 77%\r\n1000\t 89%\t 80%\r\n<\/pre>\n<p><span style=\"font-size: small;\">On both sets we can reach roughly 80% to 90%. This suggests that with a good enough language model we might meet the accuracy goal. That is probably too optimistic, however, because building a bigger language model also brings in new words, which could in turn introduce some wrong corrections, even though we did not observe that effect here.<\/span><\/p>\n<p><span style=\"font-size: small;\">There is another way to deal with unknown words. Suppose we encounter the word &#8220;electroencephalographicallz&#8221;. A good correction is to change the final &#8220;z&#8221; to &#8220;y&#8221;, because &#8216;-cally&#8217; is a very common suffix in English. Even though &#8220;electroencephalographically&#8221; is not in our dictionary either, we could still offer a spelling suggestion based on properties such as syllables or prefixes and suffixes. 
Of course, this kind of simple prefix and suffix check is much easier than an approach based on full morphology.<\/span><\/p>\n<p><span style=\"font-size: small;\">P(w|c) is the error model. So far we have used a very crude one: the smaller the edit distance, the higher the probability. This causes some problems. In the examples below, correct returns a word at edit distance 1, while the right answer is at edit distance 2:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('reciet') =&gt; 'recite' (5); expected 'receipt' (14)\r\ncorrect('adres') =&gt; 'acres' (37); expected 'address' (77)\r\ncorrect('rember') =&gt; 'member' (51); expected 'remember' (162)\r\ncorrect('juse') =&gt; 'just' (768); expected 'juice' (6)\r\ncorrect('accesing') =&gt; 'acceding' (2); expected 'assessing' (1)\r\n<\/pre>\n<p><span style=\"font-size: small;\">For example, the program ranks changing &#8216;d&#8217; to &#8216;c&#8217; in &#8216;adres&#8217; to get &#8216;acres&#8217; above the combination of changing &#8216;d&#8217; to &#8216;dd&#8217; and &#8216;s&#8217; to &#8216;ss&#8217;, and so makes the wrong call. 
At other times the program picks the wrong one of two candidates at the same edit distance, for example:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('thay') =&gt; 'that' (12513); expected 'they' (4939)\r\ncorrect('cleark') =&gt; 'clear' (234); expected 'clerk' (26)\r\ncorrect('wer') =&gt; 'her' (5285); expected 'were' (4290)\r\ncorrect('bonas') =&gt; 'bones' (263); expected 'bonus' (3)\r\ncorrect('plesent') =&gt; 'present' (330); expected 'pleasant' (97)\r\n<\/pre>\n<p><span style=\"font-size: small;\">These examples teach the same lesson: in &#8216;thay&#8217;, changing &#8216;a&#8217; to &#8216;e&#8217; is more probable than changing &#8216;y&#8217; to &#8216;t&#8217;. For the program to choose &#8216;they&#8217;, the a-to-e change would have to be weighted at least 2.5 times more heavily (12513 \/ 4939 is about 2.53) to overcome the prior preference for &#8216;that&#8217;.<\/span><\/p>\n<p><span style=\"font-size: small;\">Clearly we could use a better model of the probability of a misspelling. For instance, doubling a letter or typing one vowel in place of another should be more likely than other typing errors. Better still is to start from data: take a corpus of spelling errors and count how likely each insertion, deletion, transposition, and alteration is, given the surrounding letters. 
Gathering these probabilities would probably take a very large dataset. For example, to estimate the probability of one letter being replaced by another given two letters of context on each side, there are 26<sup>6<\/sup> cases, a little over 300 million. Each case needs several observations on average to support it, so we would need a training set of at least a billion characters, and for good quality probably at least 10 billion.<\/span><\/p>\n<p><span style=\"font-size: small;\">Note that the language model and the error model are connected. Our program assumes that any candidate at edit distance 1 beats any candidate at edit distance 2. Such a crude error model makes it hard to exploit the strengths of the language model. The reason we did not add many uncommon words to the language model is the worry that one of them would happen to be at edit distance 1 from the word being corrected, so that more frequent words at edit distance 2 could never be chosen. 
With a better error model we could feel much more comfortable adding more uncommon words. Here are some examples where uncommon words in the dictionary hurt the result:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('wonted') =&gt; 'wonted' (2); expected 'wanted' (214)\r\ncorrect('planed') =&gt; 'planed' (2); expected 'planned' (16)\r\ncorrect('forth') =&gt; 'forth' (83); expected 'fourth' (79)\r\ncorrect('et') =&gt; 'et' (20); expected 'set' (325)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Enumerating the candidate corrections and choosing the most probable one: argmax<sub>c<\/sub>. Our program enumerates all words within edit distance 2. In the development set, only 3 of the 270 words are beyond edit distance 2, but in the final test set there are 23 out of 400. 
They are:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\npurple perpul\r\ncurtains courtens\r\nminutes muinets\r\n\r\nsuccessful sucssuful\r\nhierarchy heiarky\r\nprofession preffeson\r\nweighted wagted\r\ninefficient ineffiect\r\navailability avaiblity\r\nthermawear thermawhere\r\nnature natior\r\ndissension desention\r\nunnecessarily unessasarily\r\ndisappointing dissapoiting\r\nacquaintances aquantences\r\nthoughts thorts\r\ncriticism citisum\r\nimmediately imidatly\r\nnecessary necasery\r\nnecessary nessasary\r\nnecessary nessisary\r\nunnecessary unessessay\r\nnight nite\r\nminutes muiuets\r\nassessing accesing\r\nnecessitates nessisitates\r\n<\/pre>\n<p><span style=\"font-size: small;\">We could consider allowing a limited set of edits at edit distance 3. For example, we could allow only inserting a vowel next to another vowel, replacing one vowel with another, or writing s for c, and so on. That would cover nearly all of the cases above.<\/span><\/p>\n<p><span style=\"font-size: small;\">The fourth, and best, improvement is to change the interface of correct so that it can look at the surrounding context when making its decision. In many cases it is hard to decide from the word alone: the word itself is in the dictionary, but in context it should be corrected to a different word. 
For example:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('where') =&gt; 'where' (123); expected 'were' (452)\r\ncorrect('latter') =&gt; 'latter' (11); expected 'later' (116)\r\ncorrect('advice') =&gt; 'advice' (64); expected 'advise' (20)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Looking at &#8216;where&#8217; in isolation, we cannot know when correct(&#8216;where&#8217;) should return &#8216;were&#8217; and when it should return &#8216;where&#8217;. But if correct were given &#8216;They where going&#8217;, then &#8220;where&#8221; should be corrected to &#8220;were&#8221;.<\/span><\/p>\n<p><span style=\"font-size: small;\">Context can also help the program pick the best of several candidate corrections, for example:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('hown') =&gt; 'how' (1316); expected 'shown' (114)\r\ncorrect('ther') =&gt; 'the' (81031); expected 'their' (3956)\r\ncorrect('quies') =&gt; 'quiet' (119); expected 'queries' (1)\r\ncorrect('natior') =&gt; 'nation' (170); expected 'nature' (171)\r\ncorrect('thear') =&gt; 'their' (3956); expected 'there' (4973)\r\ncorrect('carrers') =&gt; 'carriers' (7); expected 'careers' (2)\r\n<\/pre>\n<p><span style=\"font-size: small;\">Why should &#8216;thear&#8217; be corrected to &#8216;there&#8217; rather than &#8216;their&#8217;? 
It is hard to tell from the single word alone, but once it is placed in the sentence &#8216;There&#8217;s no there thear&#8217; the answer is immediately clear.<\/span><\/p>\n<p><span style=\"font-size: small;\">To build a system that handles several words at once (a word together with its context), we need a lot of data. Fortunately, Google has publicly released a database of counts for all word sequences up to five words long, gathered from a corpus of about a trillion words. I believe a spelling corrector that reaches 90% accuracy will need to use context to make its decisions. But we&#8217;ll leave that for another day :)<\/span><\/p>\n<p><span style=\"font-size: small;\">We could also improve accuracy by improving the training data and the test data. We grabbed about a million words and assumed they were all spelled correctly. That is not quite true: the data may contain errors, and we could try to find and fix them. Fixing the test sets, at least, is not hard. 
I noticed at least three cases where the test set says our program gave the wrong answer, but I believe the program&#8217;s answer is better than the expected one (in fact, all three expected answers are themselves misspelled):<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('aranging') =&gt; 'arranging' (20); expected 'arrangeing' (1)\r\ncorrect('sumarys') =&gt; 'summary' (17); expected 'summarys' (1)\r\ncorrect('aurgument') =&gt; 'argument' (33); expected 'auguments' (1)\r\n<\/pre>\n<p><span style=\"font-size: small;\">We could also decide which dialect of English to train for. The following three errors come from differences between American and British spelling (our training data contains both):<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\ncorrect('humor') =&gt; 'humor' (17); expected 'humour' (5)\r\ncorrect('oranisation') =&gt; 'organisation' (8); expected 'organization' (43)\r\ncorrect('oranised') =&gt; 'organised' (11); expected 'organized' (70)\r\n<\/pre>\n<p><span style=\"font-size: small;\">The final improvement would be to make the program run faster. For example, we could write it in a compiled language rather than an interpreted one. 
We could use a lookup table instead of Python&#8217;s general-purpose dict, cache computed results so that nothing is computed twice, and so on. One word of advice: before attempting any speed optimization, find out where the program is actually spending its time.<\/span><\/p>\n<p><span style=\"font-size: small;\">Source: http:\/\/blog.youxu.info\/spell-correct.html<\/span><\/p>\n<p><span style=\"font-size: small; color: #ff0000;\">pyenchant, a spell-checking library for Python<\/span><\/p>\n<p><span style=\"font-size: small;\">It can be installed directly with pip install pyenchant.<\/span><\/p>\n<p><span style=\"font-size: small;\">Usage:<\/span><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n&gt;&gt;&gt; import enchant\r\n&gt;&gt;&gt; d = enchant.Dict('en_US')\r\n&gt;&gt;&gt; d.check('python')\r\nTrue\r\n&gt;&gt;&gt; d.suggest('pytho')\r\n&#x5B;'python', 'Python', 'Pythias', 'Pythagoras', 'Pythagorean', 'pathos', 'paths']\r\n<\/pre>\n<p><span style=\"font-size: small;\">Tutorial: http:\/\/pythonhosted.org\/pyenchant\/tutorial.html<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week, two of my friends, Dean and Bill, each told me that Google&#8217;s fast, high-quality spell checker 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-2738","post","type-post","status-publish","format-standard","hentry","category-skill"],"_links":{"self":[{"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/posts\/2738","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/comments?post=2738"}],"version-history":[{"count":4,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/posts\/2738\/revisions"}],"predecessor-version":[{"id":2740,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/posts\/2738\/revisions\/2740"}],"wp:attachment":[{"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/media?parent=2738"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/categories?post=2738"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kyle.ai\/blog\/wp-json\/wp\/v2\/tags?post=2738"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}