How to find an English teacher. Part 2
This is a continuation of the story about using data science to find an English teacher. If you have not read the first part yet, there is an opportunity to become familiar with it.
Briefly: we had information about language teachers and tried to apply some basic ideas using pandas and our expectations. Unfortunately, we got stuck on the third step, because there was not enough information to resolve our last requirement - we need no more than 3 candidates at the end.
Step 4. Wisdom of the crowd
Well, it is a bit harder than expected…
Even though we have some information about the teachers, we cannot blindly believe it. In the previous step, we drew conclusions about experience from a text description. That makes sense when we consider something that can be verified (more or less).
But if we want to choose a «TOP 3» from 7 teachers and we stick only to the descriptions they provided, mistakes are bound to happen. The reason can be illustrated by the famous saying «Every cook praises his own broth». There is only a remote chance that people will say negative things about themselves, so we need to add some related information, such as reviews. Firstly, we fetch them.
Number of lessons
The first important point: how many lessons did students have with a particular teacher? They could write amazing reviews but never book classes again. I presume this number correlates strongly with the experience of a teacher. Let's have a look at it.
According to the average number of lessons, the teacher from South Africa has more loyal students than the others.
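The comparison above can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the record layout, the field names and all the numbers are made up, and teachers are identified by country simply because that is how the article refers to them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: one row per (teacher, student) pair.
# All names and numbers here are invented for illustration.
lessons = [
    {"teacher": "South Africa", "student": "s1", "lessons": 40},
    {"teacher": "South Africa", "student": "s2", "lessons": 20},
    {"teacher": "Spain", "student": "s3", "lessons": 10},
    {"teacher": "Spain", "student": "s4", "lessons": 6},
    {"teacher": "Russia", "student": "s5", "lessons": 3},
]


def average_lessons_per_teacher(rows):
    """Return {teacher: average number of lessons per student}."""
    per_teacher = defaultdict(list)
    for row in rows:
        per_teacher[row["teacher"]].append(row["lessons"])
    return {teacher: mean(counts) for teacher, counts in per_teacher.items()}


avg_lessons = average_lessons_per_teacher(lessons)
# Teachers ordered from the most to the least "loyal" audience.
loyalty_ranking = sorted(avg_lessons, key=avg_lessons.get, reverse=True)
print(loyalty_ranking)
```

With pandas, the same aggregation would be a groupby on the teacher column followed by a mean of the lessons column.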
We need to go deeper…
On the one hand, people like to write reviews, and language learners do as well. On the other hand, these reviews are very frequently simple and look like «A good teacher», «Thank you», «The best lesson», etc.
And… to be honest, the majority of reviews are similar to each other and are not very useful to us. We could try to classify them (sentiment analysis), but that would still be more about emotions than facts. Let's look at this from another perspective.
Instead of reading excited (or disappointed) reviews, we try to understand their structure. Before, we worked with text by using regular expressions. Shall we use something more powerful to reach our goal?
Perhaps… some NLP will harm nobody.
I believe that a good review ought to contain a varied vocabulary and a range of grammatical structures. Things like «Present/Past perfect», «Conditionals», «Past participle», «Future tense» are good signs, which point to the students' ability to express their opinion in an eloquent way.
«Adapted complexity of speech» represents how well the students of a specific teacher can express their thoughts.
In other words, we try to handle it in an unusual way, ignoring «what exactly do people write?» or «what do they feel about a lesson?».
Instead, let's try to understand «HOW their speech was expressed» and find the teacher who has the most sophisticated reviews.
And then, re-rank the teachers by this principle.
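To make the idea of scoring «how» a review is written more concrete, here is a deliberately crude sketch. It is an assumption on my side, not the approach from the notebook: the regular expressions below are rough, hypothetical heuristics for a few grammatical structures, and a real pipeline would more likely rely on a part-of-speech tagger or a parser. The sample reviews are invented.

```python
import re
from statistics import mean

# Rough, hypothetical patterns for a few grammatical structures.
# They will produce false positives; this is only an illustration.
PATTERNS = {
    "perfect": re.compile(
        r"\b(?:have|has|had)\s+(?:\w+\s+)?\w+(?:ed|en)\b", re.IGNORECASE
    ),
    "conditional": re.compile(
        r"\bif\b[^.!?]*\b(?:would|could|will)\b", re.IGNORECASE
    ),
    "future": re.compile(r"\b(?:will|shall)\s+\w+", re.IGNORECASE),
}


def complexity_score(review):
    """Count how many distinct structures appear in one review."""
    return sum(1 for pattern in PATTERNS.values() if pattern.search(review))


def teacher_complexity(reviews_by_teacher):
    """Average the per-review score for every teacher."""
    return {
        teacher: mean(complexity_score(text) for text in reviews)
        for teacher, reviews in reviews_by_teacher.items()
    }


# Invented sample reviews.
sample_reviews = {
    "teacher_a": ["Good teacher", "Thank you"],
    "teacher_b": [
        "I have improved a lot, and if I had more time I would book more lessons.",
        "Thank you",
    ],
}

complexity = teacher_complexity(sample_reviews)
complexity_ranking = sorted(complexity, key=complexity.get, reverse=True)
print(complexity_ranking)
```

Short reviews such as «Thank you» score zero here, so a teacher whose students write longer, more structured sentences moves up the list.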
Step 5. Time for big guns.
It is a well-known fact that students want to learn from the best teacher. However… I would rather learn from teachers whose students have reached the B2 level or higher. And in my opinion, we can assess the level of students from their reviews.
There is an amazing dataset: the EF-Cambridge Open Language Database. The point is that it contains a huge set of «an expression - a language proficiency level» pairs.
We will use an LSTM network that has been trained on this dataset (that is a completely different story...). The main idea behind it is «an automatic classification of English learner proficiency» from text.
Let me show you an example of how it works with a piece from my last essay:
Okay, let's try to do the same in real life.
We also tend to shift our score towards students who have had more classes with the specific teacher. Usually, people who booked/took many lessons are more likely to write extended reviews than people who had only one lesson.
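One simple way to express this shift is a weighted average, where each review's predicted level is weighted by the number of lessons its author took. The sketch below is an assumption about how such a weighting could look, not the exact formula from the notebook; the numeric mapping of CEFR levels and the sample values are hypothetical.

```python
# Hypothetical numeric mapping of CEFR levels (an assumption for illustration).
LEVELS = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}


def weighted_level(reviews):
    """Average predicted level, weighted by lessons taken.

    `reviews` is a list of (predicted_level, lessons_taken) tuples,
    where predicted_level is what a classifier returned for the review.
    """
    total_lessons = sum(lessons for _, lessons in reviews)
    if total_lessons == 0:
        return 0.0
    weighted_sum = sum(LEVELS[level] * lessons for level, lessons in reviews)
    return weighted_sum / total_lessons


# Invented example: a long-term B2 student and a newcomer at A2.
example = [("B2", 30), ("A2", 10)]
print(weighted_level(example))  # 3.5, while a plain average would give 3.0
```

The long-term student pulls the score towards B2, which is exactly the bias described above.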
And after that, look at the result of this approach:
Step 6. Visualization + analysis
It is time for integration! We have a teacher who is dramatically better in the «opinion» of the ML model, but… instead of relying only on this metric, we can combine different scores, assign each teacher a rank, and then order them by this combined value. This idea is very similar to «a ranked voting system».
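A minimal sketch of such a combination is shown below: every metric ranks the teachers independently, and the ranks are summed, so the lowest total wins. This is one possible reading of «a ranked voting system» (close to a Borda count), not necessarily the exact scheme used in the notebook; the metric names, teacher labels and scores are hypothetical.

```python
def rank_positions(scores):
    """Map each teacher to a rank for one metric (1 = best, higher score wins)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {teacher: position + 1 for position, teacher in enumerate(ordered)}


def combined_ranking(metrics):
    """Sum the ranks over all metrics and order teachers by the total."""
    totals = {}
    for scores in metrics.values():
        for teacher, rank in rank_positions(scores).items():
            totals[teacher] = totals.get(teacher, 0) + rank
    return sorted(totals, key=totals.get)


# Invented scores for three hypothetical teachers.
metrics = {
    "avg_lessons": {"A": 30, "B": 8, "C": 3},
    "review_level": {"A": 3.5, "B": 4.2, "C": 2.0},
    "grammar": {"A": 1.0, "B": 0.5, "C": 0.8},
}

final_order = combined_ranking(metrics)
print(final_order[:3])  # the «TOP 3»
```

Summing ranks instead of raw scores keeps one metric with a large numeric range (such as the number of lessons) from dominating the others.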
Well… According to our calculations, the teachers from an unknown country, South Africa and Spain are the most preferable.
So… we are ready to go back to our expectations and to nail the last one.
Did we arrive at our finish point? Definitely!
Conclusion
For me personally, it was a little sad that the teacher from Russia came sixth in this improvised race. At the same time, this is only one specific contest, and in any contest there will be leaders and outsiders.
The main thing I wanted to say: I hope these ideas and approaches will help you find a good online teacher using data science and ML.
At the same time, this research (or maybe «an investigation») is not only about looking for a language teacher. It is more about how data hidden behind a user interface could help us make a choice based not only on visible metrics but also on something beneath them.
P.S. Thank you for reading. There is the final version of the IPython notebook.