The Teaching Machine - How to automate learning

There have been many attempts to create an automatic teaching system that allows a student to learn without a human teacher. If possible, this would disrupt the education process, but arguably the last great educational technology was the book. Derek Muller, in his YouTube videos ‘The Most Persistent Myth’ and more recently ‘What Everyone Gets Wrong About AI and Learning’, has argued that the promise of films, video and MOOCs has not delivered a revolution.

Derek Muller provides compelling arguments both for and against the claim that AI will finally revolutionise learning. I have considered what steps would be required to allow a machine to effectively support learning in the way that a teacher does. I have drawn from a number of fields and made assumptions about the capabilities of technology to deliver the steps.

My expertise comes from working as a doctor, where teaching is a key professional skill, from writing books on how psychological techniques work and on AI learning, from studying teaching, and from an amateur interest in systems analysis and a good understanding of mathematics. The following is a road map of how it might be possible to overcome the problems and limitations of current systems. My solutions to each step may fail, but the road map lays out what will be required to achieve the outcome.

Knowledge graphs (KGs)

I started with the idea that the subject material might be stored as a knowledge graph. The structure would need to be more complex than the normal triplet (subject – relation – object), which does not contain all the necessary information. The first addition was provenance (where does the information come from?). This appeared to me to be an essential part of the process because without knowing where the information had come from I would not know how to react to it.

The next addition was domain (where does the triplet apply?). As I started to consider the problem of false information I realised that sometimes there will be two correct but inconsistent views of the same thing. It is therefore inherent that the truth of the information depends upon the domain. I used the example of general relativity and the standard model: both are correct but in different domains (gravity versus the quantum level).

The third addition is already part of knowledge systems, and that is confidence. In a knowledge graph confidence has a slightly different function compared to neural networks. I realised that there are patterns of confidence, depending on the observer, rather than an absolute value. These six pieces of information (subject, relation, object, provenance, domain and confidence) allow a better description of a relationship than can be achieved by the initial triplet.
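The six fields described above can be sketched as a small data structure. This is only an illustration under my own assumptions: the class name `Statement`, the helper `applicable`, the observer labels and the confidence numbers are all hypothetical, and storing confidence as one value per observer is just one possible reading of "patterns of confidence".

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One knowledge-graph entry: the usual triplet plus three extra fields."""
    subject: str
    relation: str
    obj: str
    provenance: str  # where does the information come from?
    domain: str      # where does the triplet apply?
    # Assumption: confidence is a pattern, one value per observer,
    # rather than a single absolute number.
    confidence: dict = field(default_factory=dict)

    def confidence_for(self, observer, default=0.0):
        """Return the confidence held by one observer (default if unknown)."""
        return self.confidence.get(observer, default)


def applicable(statements, domain):
    """Keep only the statements that apply in the given domain."""
    return [s for s in statements if s.domain == domain]


# Two statements that are both correct, but in different domains.
# All values below are illustrative, not measured.
kg = [
    Statement("general relativity", "describes", "physical behaviour",
              provenance="illustrative textbook entry", domain="gravity",
              confidence={"physicist": 0.95, "new student": 0.4}),
    Statement("standard model", "describes", "physical behaviour",
              provenance="illustrative textbook entry", domain="quantum level",
              confidence={"physicist": 0.95, "new student": 0.2}),
]

for s in applicable(kg, "gravity"):
    print(s.subject, s.relation, s.obj, "in domain:", s.domain)
```

Filtering by domain before comparing statements is one way the two inconsistent views could sit in the same graph without being flagged as a contradiction.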

Helpful response

The next step in the learning machine is to understand the response of the learner to information. This creates a chicken and egg type problem: how does the learning machine know what to offer if it does not know what the learner needs? I played with ideas such as using choice and attention to derive insights into the learner but realised that this would be impossibly complex and unreliable.

I next looked at emotional communication, which has been in the minds of both AI experts and GPs. Both groups have made little progress, but interestingly for different reasons. The AI experts have been trying to find the ‘helpful response’ by copying the professional: ‘relevant to the subject, complete, factual and safe’. GPs have been able to provide helpful responses but have not been able to define what makes them helpful.

I recognised that the difference between the AI responses and my colleagues' approach was that we focused on being proactive and creating a relationship. I am able to monitor my success by how interesting the interaction feels. From this insight I was able to break down the process into seven parts: (1) individualised, (2) open to new ideas, (3) uses different types of interaction, (4) adjusts emotional tone, (5) recognises areas of uncertainty, (6) clarifies ambiguity and (7) generates options.

Prediction

I have been impressed by the type of AI that generates predictions from data using a diffusion technique. Such models have been used to predict the weather with much less computational power than traditional methods. The AI does not need to understand the physics and only needs to be fed data to be able to predict the patterns. Although it has had limited success, there is one feature that stood out: the predictions can be used to create adversarial learning.

The Teaching Machine needs to be able to make predictions and learn from those predictions to move away from human feedback. Any system that can predict what will be interesting to the person has great potential to capture attention. Systems that can more accurately recommend the next product have already transformed the internet space. Even a system based on limited data can be effective, and the Teaching Machine would have two upgrades over such a system.

The first is transforming the general observations that specific groups have special interests into an understanding of what a specific person is actually interested in. The helpful response is not limited to an algorithmic selection or the professional approach. The response can be emotionally intelligent as it is relational and proactive. By predicting the emotional response (interest and enjoyment) as well as attention length the Teaching Machine can choose a more effective strategy. 

The second upgrade is that the responses are more complex and detailed, meaning that the Teaching Machine can see if the prediction was correct. The student will interact with the Teaching Machine as a student does with a teacher: discussing, questioning and making mistakes. This rich information makes it much easier to create a system that will capture the student's understanding and their emotional responses.
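The loop of predicting, observing the student's response and correcting the prediction can be sketched in a few lines. This is deliberately not the diffusion technique mentioned above; it is a much simpler running-average stand-in, and the class name `InterestPredictor`, the 0 to 1 interest scale, the prior and the learning rate are all assumptions made for illustration.

```python
class InterestPredictor:
    """Toy predictor of how interesting a topic will be to one student.

    Assumption: interest is scored between 0.0 and 1.0, and the observed
    score has already been extracted from the student's response.
    """

    def __init__(self, learning_rate=0.3, prior=0.5):
        self.learning_rate = learning_rate
        self.prior = prior       # starting guess for an unseen topic
        self.estimates = {}      # topic -> current predicted interest

    def predict(self, topic):
        """Return the current prediction for a topic."""
        return self.estimates.get(topic, self.prior)

    def update(self, topic, observed):
        """Compare the prediction with what was observed and move towards it.

        Returns the prediction error, which shows whether the
        prediction was correct without needing human feedback.
        """
        predicted = self.predict(topic)
        error = observed - predicted
        self.estimates[topic] = predicted + self.learning_rate * error
        return error


predictor = InterestPredictor(learning_rate=0.5)
first_error = predictor.update("fractions", observed=1.0)
print("error on first prediction:", first_error)
print("revised prediction:", predictor.predict("fractions"))
```

The richer the student's responses, the more reliable the observed score becomes, which is where the second upgrade would make a difference.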

Digital model

The Teaching Machine needs to store this information about the student so that it can make more accurate predictions. The adapted knowledge graph from above appeared to be the best approach. The advantage of being able to compare the KG of the curriculum with the KG of the person suggested that this was the answer. The additional fields would help solve some of the problems to do with inconsistency. 
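Comparing the KG of the curriculum with the KG of the person might look something like the sketch below. The dictionary layout, the function name `compare` and the example entries are hypothetical; a real system would need the extra fields (provenance, domain, confidence patterns) rather than the single confidence number used here.

```python
def compare(curriculum, student):
    """Split curriculum entries into known, missing and conflicting.

    curriculum: {(subject, relation): object}
    student:    {(subject, relation): (object, confidence)}
    """
    known, missing, conflicting = [], [], []
    for key, expected in curriculum.items():
        if key not in student:
            missing.append(key)          # not yet learned
        elif student[key][0] == expected:
            known.append(key)            # agrees with the curriculum
        else:
            # The student holds a different answer; keep their confidence
            # so a confidently held wrong answer can be spotted.
            conflicting.append((key, student[key]))
    return known, missing, conflicting


# Illustrative entries only.
curriculum = {
    ("heart", "pumps"): "blood",
    ("insulin", "produced by"): "pancreas",
    ("lungs", "exchange"): "gases",
}
student = {
    ("heart", "pumps"): ("blood", 0.9),
    ("insulin", "produced by"): ("liver", 0.8),
}

known, missing, conflicting = compare(curriculum, student)
print("known:", known)
print("missing:", missing)
print("conflicting:", conflicting)
```

Keeping the student's confidence alongside a conflicting answer is a design choice: it lets the Teaching Machine treat a tentative guess differently from a firmly held belief.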

When I considered how the student KG would work I came across some problems. The first was that the student will often have high certainty in a wrong or incomplete answer. There was a risk that even if the Teaching Machine was emotionally intelligent it would struggle with a student whose ideas were too distant from the data. Another problem was that any biases in the system of extraction might give a false assessment of the student’s understanding; for example, the extraction might correct their mistakes.

I decided that the problem with KGs for this purpose was the quality of the data that they contain. There is a balance between having too many entities (different names for similar things) on the one hand and having too few, so that it is difficult to distinguish between similar concepts, on the other. The KG does not have a solution to this problem: you trade accuracy for usability.

I started by considering using few-shot prompting with diverse examples and asking the AI to provide creative ways of storing that information. This brute-force approach, helped by expert support, might increase the number of concepts that can be stored using the KG. However, even if creative solutions were found, there would be a residual of information that would require exponentially more entities to capture.

I thought about using a Property Graph (PG) to manage the difficult concepts, as this would have the advantage of capturing several properties in a single entry. By sacrificing the simplicity of the KG, the PG would be able to store a more accurate version of the student’s response. I then realised that with further learning the Teaching Machine would be able to resolve many of the PG entries. Creating nonstandard ideas is part of the learning process and is where attention should be focused by the Teaching Machine.
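One way to picture a PG entry and its later resolution is sketched below. This is my own reading of "resolve", namely converting properties into ordinary triplets once a standard relation exists for them; the names `PropertyNode` and `resolve`, the property keys and the example idea are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyNode:
    """A single PG entry holding several properties of a nonstandard idea."""
    label: str
    properties: dict = field(default_factory=dict)
    resolved: bool = False


def resolve(node, known_relations):
    """Turn properties that match a known relation into plain triplets.

    known_relations maps a property name to a standard KG relation.
    Properties with no match stay on the node for later attention.
    """
    triplets = []
    remaining = {}
    for name, value in node.properties.items():
        if name in known_relations:
            triplets.append((node.label, known_relations[name], value))
        else:
            remaining[name] = value
    node.properties = remaining
    node.resolved = not remaining  # fully resolved only if nothing is left
    return triplets


# Illustrative student idea that does not fit a single triplet.
idea = PropertyNode(
    label="falling objects",
    properties={
        "speed depends on": "weight",
        "student wording": "heavy things drop quicker",
    },
)

triplets = resolve(idea, {"speed depends on": "depends on"})
print("resolved triplets:", triplets)
print("still unresolved:", idea.properties)
```

Whatever remains on the node after resolution marks the nonstandard part of the student's idea, which is where the Teaching Machine's attention would be focused.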

Further problems

Extracting information from a student’s response to create a digital model of the student is likely to start as a primitive system. I am reminded of the research (the Centre for Longitudinal Studies) that showed that the information that could be extracted from a short piece of writing at age 11 was enough to predict the writer's future pathway. I would not recommend trying to get all the available information; I would recommend limiting the extraction to that directly associated with knowledge of the subject.

Research into the most effective ways of storing teaching materials in a KG will take practical experimentation. As these KGs will become the basis of the Teaching Machine’s function they will need to be resistant to gaming. The risk of harm if an error is present in the KG means that any KG that is going to be used for mass education will need ongoing monitoring as well as extensive testing for hallucinations.

Emotional communication needs to be fully mapped, which will take significant resources to complete. The seven parts are intuitive, and finding solutions to each part would be difficult to automate. Once a map has been created, machine learning could be used to identify errors and omissions based upon patterns in learning data from students. It is worth noting that duality is integral to emotional communication.

LLMs are excellent at creating content, and it is plausible that an LLM which has had reinforcement learning from human feedback based upon the emotional communication processes will provide more compelling content. An alternative would be to assess previously created material against the emotional communication processes and use that instead. I dismissed this idea initially as it has obvious disadvantages.

Prediction will improve with time as the predictions are compared with the student’s responses. There are ethical issues with the use of these prediction models and the confidentiality of the student’s responses. It is doubtful that a single commercial organisation should be allowed to control this data. Currently there are organisations that hold similarly sensitive information about shopping habits, internet searches and emails, which could be used in the same way.

The technology for turning free text into KGs, even with Property Graphs to capture the more complex data, will likely have accuracy and reliability issues. The current level of technology may be sufficient to provide insight into the student’s understanding before it reaches its full potential. This may allay some of the ethical concerns, as the potential of this limited information will be less threatening.

Conclusions

This road map suggests that there are parts of a Teaching Machine which are amenable to progress. Other aspects of the road map may have to wait until progress is made or the first Teaching Machines are created. The only concern is that LLMs may not be able to master emotional communication and Teaching Machines might have to depend upon human-made content to be effective.

There are two areas which may require further technological progress. The first is the prediction model. At least in principle there are many possible solutions, such as simple decision trees and premade predictions, LLM-generated predictions, student-generated predictions and so on. The advantage of a diffusion model is that such predictions can become more sophisticated and useful as more data is obtained.

The second area is in the development of knowledge graphs to store and manipulate the data. Although in principle this is simply a data processing problem, and with new techniques solutions will be found, there is a deeper problem. The ability to connect between ideas is limited due to the structure of a KG database. Unless a novel way of connecting ideas is found, like the "Query," "Key," and "Value" in a transformer, KGs may become a dead end.


By Dr Mark Burgin BM BCh MA (oxon) MRCGP

Dr Mark Burgin graduated from Oxford University in 1987 and studied with the Open University on two occasions in the 1990s. He has also studied for the CPE (law) and Medical Ethics, and learned Portuguese by living in Brazil. He has written many articles and books on Personal Injury and the LLMS (your PGCME) and is about to publish a book on Disability. His next book is on psychological techniques.

September 2025
